Problem description
tags categories classification
0 label ['legislative','law,govt and politics','exe... None
0 document ['legislative',govt and politics','exe... NaN
0 text ['legislative','exe... NaN
0 paper ['legislative',govt and politics','exe... NaN
0 poster ['legislative','exe... NaN
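I want to create a new DataFrame that collapses the DataFrame above into the one below, so that the elements of each column, such as `tags` and `classification`, are gathered into a single row whose cell holds one list, for example: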
tags categories classification
0 ['label','document','text','paper','poster'] [['legislative',govt and politics','exe... ['None','NaN','NaN','NaN']
*This is the result of df.to_dict():
{'tags': {0: ' letter',1: ' head',2: ' water',3: ' art',4: ' indoors',5: ' flyer',6: ' poster',...},'categories': {0: "['legislative','executive branch','work','society','government']",1: "['unrest and war','religion and spirituality','buddhism']",2: '[]',3: '[]',4: "['unemployment','foreign policy','politics','armed forces']",5: '[]',6: "['sports','wrestling']",...},'classification': {0: nan,1: nan,2: nan,3: nan,4: nan,5: nan,6: nan,...}}
Solution
I'm not sure I fully understood your question, but is something like this what you want?
df:
trial_num subject samples
0 1 1 [-1.74,-0.78,-0.11]
1 2 1 [0.86,0.21,-0.01]
2 3 1 [2.04,0.6,-0.79]
3 1 2 [0.52,0.49,1.56]
4 2 2 [0.07,0.84,-1.1]
5 3 2 [0.43,-1.3,1.99]
df after the conversion:
trial_num subject samples
0 [1,2,3,1,2,3] [1,1,1,2,2,2] [[-1.74,-0.78,-0.11],[0.86,0.21,-0.0...
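import numpy as np
import pandas as pd
# Six rows, so each column needs six values.
df = pd.DataFrame(
{'trial_num': [1,2,3,1,2,3],'subject': [1,1,1,2,2,2],'samples': [list(np.random.randn(3).round(2)) for i in range(6)]
}
)
# Cast to str, join each column with commas, split back into a list, and transpose into one row.
# Every list element is a string after this step.
df = df.astype(str).apply(','.join).apply(lambda x: x.split(',')).to_frame().T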
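Because the `astype(str)` approach turns every value into a string, here is a minimal sketch of an alternative that keeps the original Python objects. It assumes you simply want one list per column; the helper name `collapse_columns` is made up for this illustration and is not part of pandas.

```python
import pandas as pd


def collapse_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a one-row DataFrame whose cells hold each column's values as a list."""
    # Series.tolist() keeps the original objects (ints, nested lists, NaN),
    # so nothing is converted to a string along the way.
    return pd.DataFrame([{col: df[col].tolist() for col in df.columns}])


# Same example data as in the table above, written out explicitly.
df = pd.DataFrame(
    {
        "trial_num": [1, 2, 3, 1, 2, 3],
        "subject": [1, 1, 1, 2, 2, 2],
        "samples": [
            [-1.74, -0.78, -0.11],
            [0.86, 0.21, -0.01],
            [2.04, 0.6, -0.79],
            [0.52, 0.49, 1.56],
            [0.07, 0.84, -1.1],
            [0.43, -1.3, 1.99],
        ],
    }
)

collapsed = collapse_columns(df)
print(collapsed.shape)                 # one row, three columns
print(collapsed.loc[0, "trial_num"])   # a list of ints, not strings
print(collapsed.loc[0, "samples"][0])  # the first nested list, unchanged
```

The same helper should work on the `tags`, `categories`, and `classification` columns from the question. Note that `categories` would remain a list of strings there, since the `to_dict()` output shows each entry stored as text rather than as a real list.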