Problem description
I'm new to Python and have a pandas DataFrame that looks like this:
df =
sn  sent                                              ent
0   ['an','apple','is','an','example','of','what?']   ['O','F','O','O','O','O','O']
1   ['a','potato','is','an','example','of','what?']   ['O','V','O','O','O','O','O']
I want to create another pandas DataFrame that looks like this:
newdf=
sn sent ent
0 an O
apple F
is O
an O
example O
of O
what? O
1 a O
potato V
is O
an O
example O
of O
what? O
Here is what I tried:
(df.set_index('sn')
   .stack()
   .str.split(expand=True)
   .stack()
   .unstack(level=1)
   .reset_index(level=0, drop=False))
This gets close to what I want, but I can't seem to figure out the rest:
sn  sent        ent
0   ['an',      ['O',
0   'apple',
0   'is',
0   'an',
0   'example',
0   'of',
0   'what?',
1   'a',
1   'potato',
1   'is',
1   'an',
1   'example',
1   'of',
1   'what?']    'O']
Any pointers would be greatly appreciated.
Solution
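import pandas as pd

df = pd.DataFrame({'sn': [0, 1],
                   'sent': [['an','apple','is','an','example','of','what?'],
                            ['a','potato','is','an','example','of','what?']],
                   'ent': [['O','F','O','O','O','O','O'],
                           ['O','V','O','O','O','O','O']]})
# Apply Series.explode to each column so every list element gets its own row,
# then use sn as the index.
df.apply(pd.Series.explode).set_index('sn')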
Result:
    sent     ent
sn
0   an       O
0   apple    F
0   is       O
0   an       O
0   example  O
0   of       O
0   what?    O
1   a        O
1   potato   V
1   is       O
1   an       O
1   example  O
1   of       O
1   what?    O
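
The element-wise explode only pairs tokens with tags correctly when, in each row, the sent and ent lists have the same length. The following is a minimal sketch of a slightly more defensive variant, not the answer's exact code: it assumes the sample data reconstructed from the desired output in the question, checks the lengths first, and moves sn into the index before exploding so that only list columns are passed to Series.explode. The names same_length and newdf are chosen here for illustration.

```python
import pandas as pd

# Sample data, assumed to match the desired output shown in the question.
df = pd.DataFrame({
    "sn": [0, 1],
    "sent": [
        ["an", "apple", "is", "an", "example", "of", "what?"],
        ["a", "potato", "is", "an", "example", "of", "what?"],
    ],
    "ent": [
        ["O", "F", "O", "O", "O", "O", "O"],
        ["O", "V", "O", "O", "O", "O", "O"],
    ],
})

# Tokens and tags can only be paired row by row if both lists
# have the same length in every row.
same_length = (df["sent"].map(len) == df["ent"].map(len)).all()
if not same_length:
    raise ValueError("sent and ent must have the same length in every row")

# Put the scalar key in the index, explode each list column,
# then bring sn back as an ordinary column.
newdf = df.set_index("sn").apply(pd.Series.explode).reset_index()
print(newdf)
```

On pandas 1.3 or later, df.explode(['sent', 'ent']) should give an equivalent long-format frame in a single call; it keeps the original row labels as the index, so a reset_index(drop=True) may be wanted afterwards.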