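Problem description
I realize this is very similar to this question. However, I have a CSV file that always arrives in the same format, and I need to write it out with the columns in a different order so that it can move further down the data-processing pipeline. If my CSV file contains a header and data like this: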
Date,Individual,Plate,Sample,test,QC
03312011,Indiv098,P342,A1,deep,passed
03312011,Indiv113,P352,C3,deep,passed
How would I write out a CSV file with the same columns as the original input CSV, but in the following order:
test,QC,Plate,Sample
deep,passed,P342,A1
deep,passed,P352,C3
My initial idea was to do something like this:
f = open('test.csv')
lines = f.readlines()
for l in lines:
    h = l.split(",")
    a,b,c,d,e,f = h
    for line in h:
        print e,f,
Solution
If there is any chance that the input file or the output file will not have the same layout every time, here is a more general way of getting "reorderfunc", driven by the header row instead of fixed column positions:
import csv
import operator

writenames = "test,QC,Plate,Sample".split(",")  # example
reader = csv.reader(input_file_handle)
writer = csv.writer(output_file_handle)
# don't forget to open both files in binary mode (2.x)
# or with `newline=''` (3.x)
readnames = next(reader)  # header row; next() works on 2.6+ and 3.x
name2index = dict((name, index) for index, name in enumerate(readnames))
writeindices = [name2index[name] for name in writenames]
reorderfunc = operator.itemgetter(*writeindices)
writer.writerow(writenames)
for row in reader:
    writer.writerow(reorderfunc(row))
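
As a self-contained sketch of the header-driven approach above, the following Python 3 version wraps the same steps in a function and runs them on in-memory data. The function name `reorder_csv`, the `io.StringIO` buffers, and the sample rows (modeled on the question's file) are assumptions made for illustration and are not part of the original answer.

```python
import csv
import io
import operator


def reorder_csv(src, dst, writenames):
    """Copy CSV rows from src to dst, keeping only writenames, in that order.

    Hypothetical helper: src and dst are open text-mode file objects.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    readnames = next(reader)  # header row of the input
    name2index = {name: index for index, name in enumerate(readnames)}
    writeindices = [name2index[name] for name in writenames]
    reorderfunc = operator.itemgetter(*writeindices)
    writer.writerow(writenames)
    for row in reader:
        writer.writerow(reorderfunc(row))


# Sample input, assumed to match the layout shown in the question.
sample = (
    "Date,Individual,Plate,Sample,test,QC\n"
    "03312011,Indiv098,P342,A1,deep,passed\n"
    "03312011,Indiv113,P352,C3,deep,passed\n"
)

out = io.StringIO()
reorder_csv(io.StringIO(sample), out, ["test", "QC", "Plate", "Sample"])
result = out.getvalue()
print(result)
```

One caveat with this design: `operator.itemgetter` returns a bare item rather than a tuple when given a single index, so the sketch assumes at least two output columns. A requested name that is missing from the header raises `KeyError` when `writeindices` is built, before any data row is written.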
When the layout is fixed, reorderfunc can simply be hard-coded by column position:
import operator

reorderfunc = operator.itemgetter(4,5,2,3)
...
newrow = reorderfunc(oldrow)
...
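
To make the fixed-index variant concrete, here is a minimal sketch that applies `operator.itemgetter(4, 5, 2, 3)` to a single row. The `oldrow` values are taken from the first data row of the question; the index-to-column mapping in the comment assumes the header order `Date,Individual,Plate,Sample,test,QC`.

```python
import operator

# Assumed column positions in the input header:
# 0=Date, 1=Individual, 2=Plate, 3=Sample, 4=test, 5=QC
reorderfunc = operator.itemgetter(4, 5, 2, 3)

oldrow = ["03312011", "Indiv098", "P342", "A1", "deep", "passed"]
newrow = reorderfunc(oldrow)
print(newrow)  # ('deep', 'passed', 'P342', 'A1')
```

The result is a tuple, which `csv.writer.writerow` accepts directly.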
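The input is src.csv:
import csv

# Python 2: open CSV files in binary mode. On Python 3 use
# open('x.csv', newline='') and open('y.csv', 'w', newline='') instead.
with open('x.csv','rb') as i:
    with open('y.csv','wb') as o:
        r = csv.DictReader(i)
        # extrasaction='ignore' drops input columns that are not listed here
        w = csv.DictWriter(o,'test QC Plate Sample'.split(),extrasaction='ignore')
        w.writeheader()
        for a in r:
            w.writerow(a)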
Output
test,QC,Plate,Sample
deep,passed,P342,A1
deep,passed,P352,C3
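
For readers on Python 3, the DictReader/DictWriter approach might look like the sketch below. It uses in-memory `io.StringIO` buffers instead of files on disk so that it runs as-is; the helper name `select_columns` and the sample rows are assumptions for illustration, not part of the original answer.

```python
import csv
import io


def select_columns(src, dst, fieldnames):
    """Write only fieldnames from src to dst, in the given order.

    Hypothetical helper: src and dst are open text-mode file objects.
    """
    reader = csv.DictReader(src)
    # extrasaction="ignore" silently skips keys that are not in fieldnames.
    writer = csv.DictWriter(
        dst, fieldnames, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for record in reader:
        writer.writerow(record)


# Sample input, assumed to match the layout shown in the question.
sample = (
    "Date,Individual,Plate,Sample,test,QC\n"
    "03312011,Indiv098,P342,A1,deep,passed\n"
    "03312011,Indiv113,P352,C3,deep,passed\n"
)

out = io.StringIO()
select_columns(io.StringIO(sample), out, "test QC Plate Sample".split())
result = out.getvalue()
print(result)
```

When working with real files, open them as `open(path, newline="")` for reading and `open(path, "w", newline="")` for writing, as the `csv` module documentation recommends.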
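# Use the csv library
import csv

media = {}
files = ['Online.txt']
directory = "C:/directory/"
rowCnt = 0
for file in files:
    file = directory + file
    # Python 2: binary mode; on Python 3 use open(file, newline='')
    with open(file, 'rb') as f:
        reader = csv.reader(f, delimiter='|')  # use pipe delimiter
        for row in reader:
            rowCnt += 1
            if (rowCnt % 1000) == 0:  # print every 1000th row only
                # one quoted "%s" per selected column (the original format
                # string had 3 placeholders for 9 values, a TypeError)
                columns = (1, 4, 14, 17, 18, 24, 25, 28, 30)
                print(','.join('"%s"' % row[c] for c in columns))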