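Problem description
As the title says, I'm trying to work out how to get a multi-line block of text to fit into a single cell. For context, I'm using Beautiful Soup to extract mtDNA sequences and other data from this site, and I'm writing those values to a CSV file.
I tried using str.strip('\n') to put the text on a single line, but that did not work: the text still spilled over onto the following lines. My code is below.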
import requests
theSequenceLink = 'https://www.ncbi.nlm.nih.gov/sviewer/viewer.fcgi?id=1877761016&db=nuccore&report=fasta&extrafeat=null&conwithfeat=on&hide-cdd=on&retmode=html&withmarkup=on&tool=portal&log$=seqview&maxdownloadsize=1000000'
res = requests.get(theSequenceLink)
dna_sequence = res.text.strip()
#cleaning up the sequence
split = 'genome'
mtDNA_sequence = dna_sequence.partition(split)[2]
#you can ignore the genbank and haplogroup stuff
f.write(genbank_ID + "," + haplogroup.replace(",","|") + "," + mtDNA_sequence + "\n")
Any help with this problem would be greatly appreciated.
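Solution
The problem is that the DNA sequence itself contains newline characters. str.strip('\n') only removes newlines at the very start and end of a string, so the ones inside the sequence remain and you have to replace them.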
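To make the difference concrete, here is a minimal sketch comparing the two calls. The `sample` string is a made-up stand-in for the downloaded FASTA text, not real data from the site.

```python
# Hypothetical sample standing in for the downloaded sequence text.
sample = "\nGATCACAGGT\nCTATCACCCT\nATTAACCACT\n"

# strip() only trims the given characters from both ends of the string.
stripped = sample.strip("\n")

# replace() removes every occurrence, including the ones between lines.
flattened = sample.replace("\n", "")

# Alternative: split on any whitespace (also catches "\r") and rejoin.
joined = "".join(sample.split())

print(repr(stripped))
print(repr(flattened))
print(repr(joined))
```

The stripped string still contains the inner newlines, which is why the text kept flowing onto the next row of the CSV.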
import requests
theSequenceLink = 'https://www.ncbi.nlm.nih.gov/sviewer/viewer.fcgi?id=1877761016&db=nuccore&report=fasta&extrafeat=null&conwithfeat=on&hide-cdd=on&retmode=html&withmarkup=on&tool=portal&log$=seqview&maxdownloadsize=1000000'
res = requests.get(theSequenceLink)
dna_sequence = res.text.strip()
#cleaning up the sequence
split = 'genome'
mtDNA_sequence = dna_sequence.partition(split)[2].strip().replace("\n","")
#placeholder values; you can ignore the genbank and haplogroup stuff
genbank_ID = "hi"
haplogroup = "world"
#the with block closes the file even if the write fails
with open("a.csv","w") as f:
    f.write(genbank_ID + "," + haplogroup.replace(",","|") + ",\"" + mtDNA_sequence + "\"\n")
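As an alternative to building the row by hand, Python's standard csv module can handle the quoting. The sketch below is not part of the original answer; `write_row` is a hypothetical helper and the sample values are made up. It writes to an in-memory buffer so it runs without network access or files.

```python
import csv
import io

def write_row(out, genbank_id, haplogroup, sequence):
    """Write one CSV row with the sequence collapsed onto a single line."""
    # Remove all whitespace (newlines included) from the sequence.
    one_line = "".join(sequence.split())
    # csv.writer quotes any field containing a comma, quote or newline,
    # so the haplogroup no longer needs its commas swapped for "|".
    csv.writer(out).writerow([genbank_id, haplogroup, one_line])

buffer = io.StringIO()
write_row(buffer, "hi", "H1a,b", "GATCACAGGT\nCTATCACCCT\n")
print(buffer.getvalue())

# Reading the row back returns the three original fields.
rows = list(csv.reader(io.StringIO(buffer.getvalue())))
print(rows)
```

When writing to a real file instead of a buffer, the csv documentation recommends opening it with newline="" so the writer controls the line endings.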