Awk: getting a .csv column that contains commas and newlines

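I have data in a .csv column that sometimes contains commas and newlines. If there is a comma in my data, I enclose the whole string in double quotes. How would I go about parsing that column's output to a .txt file, taking the newlines and commas into account?

Sample data that does not work with my command: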

,"This is some text with a,in it.",#data with commas are enclosed in double quotes
,line 1 of data
line 2 of data,#data with a couple of newlines
,"Data that may a have,in it and also be on a newline as well.",

This is what I have so far:

awk -F "\"*,\"*" '{print $4}' file.csv > column_output.txt

$ cat decsv.awk
BEGIN { FPAT = "([^,]*)|(\"[^\"]+\")"; OFS="," }
{
    # create strings that cannot exist in the input to map escaped quotes to
    gsub(/a/,"aA")
    gsub(/\\"/,"aB")
    gsub(/""/,"aC")

    # prepend previous incomplete record segment if any
    $0 = prev $0
    numq = gsub(/"/,"&")
    if ( numq % 2 ) {
        # this is inside double quotes so incomplete record
        prev = $0 RT
        next
    }
    prev = ""

    for (i=1;i<=NF;i++) {
        # map the replacement strings back to their original values
        gsub(/aC/,"\"\"",$i)
        gsub(/aB/,"\\\"",$i)
        gsub(/aA/,"a",$i)
    }

    printf "Record %d:\n",++recNr
    for (i=0;i<=NF;i++) {
        printf "\t$%d=<%s>\n",i,$i
    }
    print "#######"
}

$ awk -f decsv.awk file
Record 1:
    $0=<,"This is some text with a,in it.", #data with commas are enclosed in double quotes>
    $1=<>
    $2=<"This is some text with a,in it.">
    $3=< #data with commas are enclosed in double quotes>
#######
Record 2:
    $0=<,"line 1 of data
line 2 of data", #data with a couple of newlines>
    $1=<>
    $2=<"line 1 of data
line 2 of data">
    $3=< #data with a couple of newlines>
#######
Record 3:
    $0=<,"Data that may a have,in it and also be on a newline as well.",>
    $1=<>
    $2=<"Data that may a have,in it and also be on a newline as well.">
    $3=<>
#######
Record 4:
    $0=<,"Data that \"may\" a have ""quote"" in it and also be on a newline as well.",>
    $1=<>
    $2=<"Data that \"may\" a have ""quote"" in it and also be on a newline as well.">
    $3=<>
#######

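The above uses GNU awk for FPAT and RT. I don't know of any CSV format that would let you have a newline in the middle of a field that is not enclosed in quotes (if one did, you would never know where any record ended), so the script does not allow for that. The above was run on this input file: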

$ cat file,
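
For anyone without GNU awk, the same quote-parity idea can be sketched in Python. This is only an illustration, not part of the awk answer: it assumes quotes inside a field are escaped by doubling ("") rather than with a backslash, and the names logical_records, column, and sample are hypothetical, as is the sample data.

```python
import csv
import io


def logical_records(lines):
    """Join physical lines into logical CSV records.

    A record is treated as incomplete while it holds an odd number of
    double quotes, the same parity test the awk script applies.
    Assumption: embedded quotes are doubled (""), never backslash-escaped.
    A trailing record with unbalanced quotes is silently dropped.
    """
    pending = ""
    for line in lines:
        pending += line
        if pending.count('"') % 2:
            # Odd quote count: still inside a quoted field, keep reading.
            continue
        yield pending.rstrip("\n")
        pending = ""


def column(lines, index):
    """Return the 0-based column `index` of every logical record."""
    values = []
    for record in logical_records(lines):
        # csv.reader strips the enclosing quotes and keeps embedded
        # commas and newlines inside the field value.
        row = next(csv.reader(io.StringIO(record, newline="")))
        values.append(row[index])
    return values


# Hypothetical sample input, one list item per physical line.
sample = [
    ',"text with a , in it.",note one\n',
    ',"line 1 of data\n',
    'line 2 of data",note two\n',
]

print(column(sample, 1))
```

Writing the returned values to a file with one value per record would give the column_output.txt the question asks for, though a field that itself contains a newline will still span several lines in that file.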
