Problem description
The number of dictionaries in each list of the input column varies; it is not fixed.
INPUT column:
Facilities
[{'name': 'Work from home','icon': 'WFH.svg'}]
[{'name': 'Gymnasium','icon': 'Gym.svg'},{'name': 'Cafeteria','icon': 'Cafeteria.svg'},{'name': 'Work from home','icon': 'WFH.svg'}]
[{'name': 'Free food','icon': 'FreeFood.svg'},{'name': 'Team outings','icon': 'TeamOuting.svg'},{'name': 'Education assistance','icon': 'Education.svg'}]
[{'name': 'Soft skill training','icon': 'SoftSkillsTraining.svg'},{'name': 'Job training','icon': 'JobTraining.svg'}]
[{'name': 'Free transport','icon': 'Transportation.svg'},{'name': 'Work from home','icon': 'WFH.svg'},{'name': 'Soft skill training','icon': 'SoftSkillsTraining.svg'}]
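The input above should be transformed so that each cell of the column holds a single list containing every value of the key 'name' collected from the dictionaries in the original list.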
Desired Output column:
Facilities
['Work from home']
['Gymnasium','Cafeteria','Work from home']
['Free food','Team outings','Education assistance']
['Soft skill training','Job training']
['Free transport','Work from home','Soft skill training']
Solution
Suppose you have this DataFrame:
import pandas as pd

# One list of dictionaries per row; the lists differ in length.
df = pd.DataFrame({'Facilities': [
    [{'name': 'Work from home','icon': 'WFH.svg'}],
    [{'name': 'Gymnasium','icon': 'Gym.svg'},{'name': 'Cafeteria','icon': 'Cafeteria.svg'},{'name': 'Work from home','icon': 'WFH.svg'}],
    [{'name': 'Free food','icon': 'FreeFood.svg'},{'name': 'Team outings','icon': 'TeamOuting.svg'},{'name': 'Education assistance','icon': 'Education.svg'}],
    [{'name': 'Soft skill training','icon': 'SoftSkillsTraining.svg'},{'name': 'Job training','icon': 'JobTraining.svg'}],
    [{'name': 'Free transport','icon': 'Transportation.svg'},{'name': 'Work from home','icon': 'WFH.svg'},{'name': 'Soft skill training','icon': 'SoftSkillsTraining.svg'}],
]})
print(df)
Facilities
0 [{'name': 'Work from home','icon': 'WFH.svg'}]
1 [{'name': 'Gymnasium','icon': 'Gym.svg'},{'n...
2 [{'name': 'Free food','icon': 'FreeFood.svg'}...
3 [{'name': 'Soft skill training','icon': 'Soft...
4 [{'name': 'Free transport','icon': 'Transport...
Then:
df['Facilities'] = df['Facilities'].apply(lambda x: [d['name'] for d in x])
print(df)
Prints:
Facilities
0 [Work from home]
1 [Gymnasium,Cafeteria,Work from home]
2 [Free food,Team outings,Education assistance]
3 [Soft skill training,Job training]
4 [Free transport,Work from home,...
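The lambda above assumes every cell is already a Python list and every dictionary has a 'name' key. If the column was read from a CSV file, the cells may instead be strings, and some dictionaries may lack the key. The following is only a sketch under those assumptions; the helper name extract_names and the two sample rows are hypothetical and not part of the original answer.

```python
import ast

import pandas as pd


def extract_names(cell):
    """Return the 'name' values from a list of dicts (hypothetical helper)."""
    # Assumption: cells loaded from CSV arrive as strings, so parse them first.
    if isinstance(cell, str):
        cell = ast.literal_eval(cell)
    # Skip any dict without a 'name' key instead of raising KeyError.
    return [d["name"] for d in cell if "name" in d]


# Illustrative data: one string cell and one list cell with an incomplete dict.
df = pd.DataFrame({"Facilities": [
    "[{'name': 'Work from home', 'icon': 'WFH.svg'}]",
    [{"name": "Gymnasium", "icon": "Gym.svg"}, {"icon": "WFH.svg"}],
]})
df["Facilities"] = df["Facilities"].apply(extract_names)
print(df["Facilities"].tolist())
```

Whether silently skipping dictionaries without 'name' is appropriate depends on the data; raising an error may be preferable if a missing key indicates corruption.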
You can extract it with two nested list comprehensions:
facility_names = [[facility["name"] for facility in facility_list] for facility_list in facilities]
Assuming your input data is:
facilities=[
[{'name': 'Work from home','icon': 'WFH.svg'}]
]
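For reference, here is a minimal runnable sketch of the nested comprehension without pandas. It uses two rows taken from the question's input; the outer comprehension walks the rows and the inner one collects each dictionary's 'name'.

```python
# Two sample rows taken from the question's input column.
facilities = [
    [{'name': 'Work from home', 'icon': 'WFH.svg'}],
    [{'name': 'Soft skill training', 'icon': 'SoftSkillsTraining.svg'},
     {'name': 'Job training', 'icon': 'JobTraining.svg'}],
]

# Outer loop: one result list per row; inner loop: one name per dictionary.
facility_names = [
    [facility["name"] for facility in facility_list]
    for facility_list in facilities
]
print(facility_names)
# [['Work from home'], ['Soft skill training', 'Job training']]
```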