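Problem description
I want to read my text file, but the data has no explicit delimiter. I tried sep='.', but that does not produce the right format. For example: df = pd.read_table('data.txt', header=None, sep='.'). Could you give me some hints on the correct separator for this case? Thanks!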
cont1 7.6327.6957.5692.5951.3051.5920.8740.6810.1920.393
cont2 7.5947.6577.5302.3831.4561.5820.8610.6860.2860.340
cont3 7.5557.6237.4872.3701.3511.5900.8680.6830.2840.408
cont4 7.4957.5937.3982.4261.3431.6440.9140.6770.3530.138
cont5 7.4877.5707.4052.3201.4201.5490.9270.6600.2560.357
cont6 7.4417.4987.3842.4481.3611.4880.8780.6380.3330.295
Solution
You can use pandas.read_fwf(), which reads fixed-width formatted lines, for example:
In []:
import pandas as pd
from io import StringIO
s = '''cont1 7.6327.6957.5692.5951.3051.5920.8740.6810.1920.393
cont2 7.5947.6577.5302.3831.4561.5820.8610.6860.2860.340
cont3 7.5557.6237.4872.3701.3511.5900.8680.6830.2840.408
cont4 7.4957.5937.3982.4261.3431.6440.9140.6770.3530.138
cont5 7.4877.5707.4052.3201.4201.5490.9270.6600.2560.357
cont6 7.4417.4987.3842.4481.3611.4880.8780.6380.3330.295'''
# 6 characters for the label (including the trailing space), then ten 5-character numbers
pd.read_fwf(StringIO(s), header=None, widths=[6] + [5] * 10)
Out[]:
       0      1      2      3      4      5      6      7      8      9     10
0  cont1  7.632  7.695  7.569  2.595  1.305  1.592  0.874  0.681  0.192  0.393
1  cont2  7.594  7.657  7.530  2.383  1.456  1.582  0.861  0.686  0.286  0.340
2  cont3  7.555  7.623  7.487  2.370  1.351  1.590  0.868  0.683  0.284  0.408
3  cont4  7.495  7.593  7.398  2.426  1.343  1.644  0.914  0.677  0.353  0.138
4  cont5  7.487  7.570  7.405  2.320  1.420  1.549  0.927  0.660  0.256  0.357
5  cont6  7.441  7.498  7.384  2.448  1.361  1.488  0.878  0.638  0.333  0.295
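The widths passed to read_fwf() have to match the layout of each line. In the sample above, every row appears to consist of a 6-character label (including the trailing space) followed by ten numbers of exactly 5 characters each. Here is a minimal sketch under that assumption; the column names `label` and `v1` to `v10` are hypothetical and only there for illustration:

```python
from io import StringIO

import pandas as pd

# Two sample rows copied from the question.
sample = (
    "cont1 7.6327.6957.5692.5951.3051.5920.8740.6810.1920.393\n"
    "cont2 7.5947.6577.5302.3831.4561.5820.8610.6860.2860.340\n"
)

# Assumed layout: 6 characters for the label (including the trailing space),
# then ten numeric fields of 5 characters each.
widths = [6] + [5] * 10

df = pd.read_fwf(StringIO(sample), header=None, widths=widths)

# Hypothetical column names, purely for illustration.
df.columns = ["label"] + [f"v{i}" for i in range(1, 11)]

print(df.dtypes)
```

If the real file uses a different label width or number format, the widths list would need to be adjusted accordingly. To read from disk, pass the file path (for example 'data.txt' from the question) instead of the StringIO object.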