awk to get a .csv column containing commas and newlines

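I have data in a .csv column that sometimes contains commas and newlines. If there is a comma in my data, I enclose the whole string in double quotes. How would I go about parsing that column out to a .txt file while taking the newlines and commas into account?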

Sample data that doesn't work with my command:

,"This is some text with a,in it.",#data with commas are enclosed in double quotes
,line 1 of data
line 2 of data,#data with a couple of newlines
,"Data that may a have,in it and also be on a newline as well.",

This is what I have so far:

awk -F "\"*,\"*" '{print $4}' file.csv > column_output.txt


$ cat decsv.awk
BEGIN { FPAT = "([^,]*)|(\"[^\"]+\")"; OFS="," }
{
    # create strings that cannot exist in the input to map escaped quotes to
    gsub(/a/,"aA")
    gsub(/\\"/,"aB")
    gsub(/""/,"aC")

    # prepend previous incomplete record segment if any
    $0 = prev $0
    numq = gsub(/"/,"&")
    if ( numq % 2 ) {
        # this is inside double quotes so incomplete record
        prev = $0 RT
        next
    }
    prev = ""

    for (i=1;i<=NF;i++) {
        # map the replacement strings back to their original values
        gsub(/aC/,"\"\"",$i)
        gsub(/aB/,"\\\"",$i)
        gsub(/aA/,"a",$i)
    }

    printf "Record %d:\n",++recNr
    for (i=0;i<=NF;i++) {
        printf "\t$%d=<%s>\n",i,$i
    }
    print "#######"
}


$ awk -f decsv.awk file
Record 1:
        $0=<,"This is some text with a,in it.", #data with commas are enclosed in double quotes>
        $1=<>
        $2=<"This is some text with a,in it.">
        $3=< #data with commas are enclosed in double quotes>
#######
Record 2:
        $0=<,"line 1 of data line 2 of data",#data with a couple of newlines>
        $1=<>
        $2=<"line 1 of data line 2 of data">
        $3=< #data with a couple of newlines>
#######
Record 3:
        $0=<,"Data that may a have,in it and also be on a newline as well.",>
        $1=<>
        $2=<"Data that may a have,in it and also be on a newline as well.">
        $3=<>
#######
Record 4:
        $0=<,"Data that \"may\" a have ""quote"" in it and also be on a newline as well.",>
        $1=<>
        $2=<"Data that \"may\" a have ""quote"" in it and also be on a newline as well.">
        $3=<>
#######

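The above uses GNU awk for FPAT and RT. I don't know of any CSV format that lets you have a newline in the middle of a field that is not enclosed in quotes (if one did, you would never know where any record ended), so the script does not allow for that. The above was run on this input file: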

$ cat file,
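To write just one column to a .txt file, as the question asks, the same quote-counting idea can be sketched without the GNU-only FPAT and RT. This is only a rough sketch, not the script above: it swaps FPAT for a character-by-character scan, the function name csv_column is made up for illustration, and it assumes quotes inside a quoted field are escaped by doubling them ("") rather than with a backslash.

```shell
#!/bin/sh
# csv_column N FILE: print column N of each CSV record in FILE.
# Hypothetical helper for illustration. Assumes "" is the only quote
# escape and that newlines appear only inside quoted fields.
csv_column() {
    awk -v col="$1" '
    {
        # Glue physical lines together until the double quotes balance.
        rec = (pending ? rec "\n" : "") $0
        tmp = rec
        pending = (gsub(/"/, "", tmp) % 2)
        if (pending) next          # still inside a quoted field

        # Scan the record, splitting on commas that are outside quotes.
        n = 1; field = ""; inq = 0
        len = length(rec)
        for (i = 1; i <= len; i++) {
            c = substr(rec, i, 1)
            if (c == "\"") {
                if (inq && substr(rec, i + 1, 1) == "\"") {
                    field = field "\""   # doubled quote is a literal quote
                    i++
                } else {
                    inq = !inq           # opening or closing quote
                }
            } else if (c == "," && !inq) {
                if (n == col) break      # wanted field is complete
                n++; field = ""
            } else {
                field = field c
            }
        }
        # Records with fewer than N columns produce an empty line.
        out = (n == col) ? field : ""
        print out
    }' "$2"
}
```

With the function loaded, something like `csv_column 2 file.csv > column_output.txt` would write the second column. The surrounding quotes are stripped and embedded newlines are kept, so a field that spans two lines also spans two lines in the output file.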
