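Convert a column of lists of dictionaries into a column of lists, taking each value from the "name" key of every dictionary in the list

Question

The number of dictionaries in each list of the input column varies; it is not fixed.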

INPUT column:

Facilities
[{'name': 'Work from home','icon': 'WFH.svg'}]
[{'name': 'Gymnasium','icon': 'Gym.svg'},{'name': 'Cafeteria','icon': 'Cafeteria.svg'},{'name': 'Work from home','icon': 'WFH.svg'}]
[{'name': 'Free food','icon': 'FreeFood.svg'},{'name': 'Team outings','icon': 'TeamOuting.svg'},{'name': 'Education assistance','icon': 'Education.svg'}]
[{'name': 'Soft skill training','icon': 'SoftSkillsTraining.svg'},{'name': 'Job training','icon': 'JobTraining.svg'}]
[{'name': 'Free transport','icon': 'Transportation.svg'},{'name': 'Work from home','icon': 'WFH.svg'},{'name': 'Soft skill training','icon': 'SoftSkillsTraining.svg'}]

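The input above should be reduced so that each cell of the column holds a single list containing every value of the "name" key, collected from the dictionaries in that row's list.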

Desired Output column:

Facilities
['Work from home']
['Gymnasium','Cafeteria','Work from home']
['Free food','Team outings','Education assistance']
['Soft skill training','Job training']
['Free transport','Work from home','Soft skill training']

Answer

Suppose you have this DataFrame:

import pandas as pd

# One list of dicts per row; the lists have different lengths.
df = pd.DataFrame({'Facilities': [
    [{'name': 'Work from home', 'icon': 'WFH.svg'}],
    [{'name': 'Gymnasium', 'icon': 'Gym.svg'}, {'name': 'Cafeteria', 'icon': 'Cafeteria.svg'}, {'name': 'Work from home', 'icon': 'WFH.svg'}],
    [{'name': 'Free food', 'icon': 'FreeFood.svg'}, {'name': 'Team outings', 'icon': 'TeamOuting.svg'}, {'name': 'Education assistance', 'icon': 'Education.svg'}],
    [{'name': 'Soft skill training', 'icon': 'SoftSkillsTraining.svg'}, {'name': 'Job training', 'icon': 'JobTraining.svg'}],
    [{'name': 'Free transport', 'icon': 'Transportation.svg'}, {'name': 'Work from home', 'icon': 'WFH.svg'}, {'name': 'Soft skill training', 'icon': 'SoftSkillsTraining.svg'}],
]})

print(df)

                                          Facilities
0    [{'name': 'Work from home','icon': 'WFH.svg'}]
1  [{'name': 'Gymnasium','icon': 'Gym.svg'},{'n...
2  [{'name': 'Free food','icon': 'FreeFood.svg'}...
3  [{'name': 'Soft skill training','icon': 'Soft...
4  [{'name': 'Free transport','icon': 'Transport...

Then:

df['Facilities'] = df['Facilities'].apply(lambda x: [d['name'] for d in x])
print(df)

Prints:

                                          Facilities
0                                   [Work from home]
1             [Gymnasium,Cafeteria,Work from home]
2    [Free food,Team outings,Education assistance]
3                [Soft skill training,Job training]
4  [Free transport,Work from home,...
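
The lambda above raises a KeyError if any dictionary lacks a "name" key. As a minimal sketch, assuming such entries should simply be skipped, a small helper can replace the lambda. The helper name `names_from` and the malformed sample entry are hypothetical, introduced only for illustration:

```python
def names_from(dicts, key="name"):
    """Return the values stored under `key`, skipping entries that lack it."""
    return [d[key] for d in dicts if isinstance(d, dict) and key in d]


row = [
    {'name': 'Gymnasium', 'icon': 'Gym.svg'},
    {'icon': 'WFH.svg'},  # hypothetical malformed entry with no 'name' key
    {'name': 'Cafeteria', 'icon': 'Cafeteria.svg'},
]

cleaned = names_from(row)
print(cleaned)  # ['Gymnasium', 'Cafeteria']
```

With a DataFrame, the helper can be passed directly to apply, for example df['Facilities'].apply(names_from).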

You can extract it with two nested list comprehensions:

facility_names = [[facility["name"] for facility in facility_list] for facility_list in facilities]

This assumes your input data looks like:

facilities=[
[{'name': 'Work from home','icon': 'SoftSkillsTraining.svg'}]
]
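
For reference, here is the same nested comprehension as a self-contained sketch in plain Python, with the data defined before it is used. The three rows are taken from the question's input column. If the data lives in a DataFrame, `facilities` could presumably be obtained with df['Facilities'].tolist(), which is an assumption about how the data is stored:

```python
# Each inner list is one row of the column; the lengths vary.
facilities = [
    [{'name': 'Work from home', 'icon': 'WFH.svg'}],
    [{'name': 'Gymnasium', 'icon': 'Gym.svg'},
     {'name': 'Cafeteria', 'icon': 'Cafeteria.svg'},
     {'name': 'Work from home', 'icon': 'WFH.svg'}],
    [{'name': 'Soft skill training', 'icon': 'SoftSkillsTraining.svg'},
     {'name': 'Job training', 'icon': 'JobTraining.svg'}],
]

# Outer comprehension walks the rows, inner one walks the dicts in a row.
facility_names = [
    [facility["name"] for facility in facility_list]
    for facility_list in facilities
]

for names in facility_names:
    print(names)
```

The result keeps one list per input row, in the original order, so it can be assigned back as a column of the same length.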