How do I create a nested dictionary from a Pandas DataFrame?

Problem

I have the following Pandas DataFrame with several thousand rows:

import pandas
...
print(df)

  FAVORITE_FOOD FAVORITE_DRINK  ... USER_A USER_B
0    hamburgers           cola  ...   John   John
1         pasta       lemonade  ...   John   John
2      omelette         coffee  ...   John   John
3       hotdogs           beer  ...  Marie  Marie
4         pizza           wine  ...  Marie  Marie
7       popcorn             oj  ...   Adam   Adam
8         sushi         sprite  ...   Adam   Adam
...
...
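
To experiment with the excerpt above, a small stand-in frame could be built like this. It is only a sketch: it contains just the rows and columns visible in the printout, and the elided columns and rows are left out rather than guessed.

```python
import pandas as pd

# Only the rows and columns visible in the excerpt; elided ones are omitted.
df = pd.DataFrame(
    {
        "FAVORITE_FOOD": ["hamburgers", "pasta", "omelette", "hotdogs",
                          "pizza", "popcorn", "sushi"],
        "FAVORITE_DRINK": ["cola", "lemonade", "coffee", "beer",
                           "wine", "oj", "sprite"],
        "USER_A": ["John", "John", "John", "Marie", "Marie", "Adam", "Adam"],
        "USER_B": ["John", "John", "John", "Marie", "Marie", "Adam", "Adam"],
    },
    index=[0, 1, 2, 3, 4, 7, 8],  # the index in the excerpt skips 5 and 6
)
print(df)
```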

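I want to create a nested dictionary in which each person's name is a key and the value is a dictionary of that person's food/drink pairs.

Like this:

result = {
    "John": {"hamburgers": "cola", "pasta": "lemonade", "omelette": "coffee"},
    "Marie": {"hotdogs": "beer", "pizza": "wine"},
    "Adam": {"popcorn": "oj", "sushi": "sprite"},
}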

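Solution

You can build the desired dictionary as follows:

dict1 = {}

# Walk the rows by position and file each food/drink pair under its user
for i in range(len(df)):
    row = df.iloc[i, :]
    dict1.setdefault(row["USER_A"], {}).update({row["FAVORITE_FOOD"]: row["FAVORITE_DRINK"]})

setdefault creates an empty inner dictionary the first time a user is seen and returns the existing one on later rows; update then adds the current food/drink pair to it.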
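
With thousands of rows, calling iloc once per row can become slow. As a possible alternative (not part of the answer above), the same dictionary could be built with groupby and zip. The sample frame here is an assumption that mirrors the rows shown in the question.

```python
import pandas as pd

# Assumed sample data mirroring the rows visible in the question.
df = pd.DataFrame({
    "FAVORITE_FOOD": ["hamburgers", "pasta", "omelette", "hotdogs",
                      "pizza", "popcorn", "sushi"],
    "FAVORITE_DRINK": ["cola", "lemonade", "coffee", "beer",
                       "wine", "oj", "sprite"],
    "USER_A": ["John", "John", "John", "Marie", "Marie", "Adam", "Adam"],
})

# One inner dict per user; sort=False keeps users in order of first appearance.
nested = {
    user: dict(zip(group["FAVORITE_FOOD"], group["FAVORITE_DRINK"]))
    for user, group in df.groupby("USER_A", sort=False)
}
print(nested)
```

As with the loop, if a user lists the same food more than once, the last drink seen for that food wins.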

I solved this problem with the following code:

import pandas as pd

# For each user, map every favorite food to the list of drinks paired with it
group_dict = {
    k: f.groupby('FAVORITE_FOOD')['FAVORITE_DRINK'].apply(list).to_dict()
    for k, f in df.groupby('USER_A')
}

# Build the nested dictionary with a dictionary comprehension. Each inner
# value is a set of drinks (minus any drink equal to the food name), not a
# single string.
nested_dict = {
    outer_k: {
        inner_k: {inner_v for inner_v in v if inner_k != inner_v}
        for inner_k, v in outer_v.items()
    }
    for outer_k, outer_v in group_dict.items()
}
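
Because this approach collects drinks into sets, its result has the shape {user: {food: {drink, ...}}} rather than {user: {food: drink}}. If exactly one drink per food is expected, the sets could be unwrapped afterwards. The sketch below runs the two comprehensions on assumed sample data matching the question, then adds a hypothetical `flat` step that is not part of the original answer.

```python
import pandas as pd

# Assumed sample data mirroring the rows visible in the question.
df = pd.DataFrame({
    "FAVORITE_FOOD": ["hamburgers", "pasta", "omelette", "hotdogs",
                      "pizza", "popcorn", "sushi"],
    "FAVORITE_DRINK": ["cola", "lemonade", "coffee", "beer",
                       "wine", "oj", "sprite"],
    "USER_A": ["John", "John", "John", "Marie", "Marie", "Adam", "Adam"],
})

# The two steps from the answer above, unchanged in behavior.
group_dict = {
    k: f.groupby("FAVORITE_FOOD")["FAVORITE_DRINK"].apply(list).to_dict()
    for k, f in df.groupby("USER_A")
}
nested_dict = {
    outer_k: {
        inner_k: {inner_v for inner_v in v if inner_k != inner_v}
        for inner_k, v in outer_v.items()
    }
    for outer_k, outer_v in group_dict.items()
}

# Hypothetical extra step: take one drink out of each non-empty set.
flat = {
    user: {food: next(iter(drinks)) for food, drinks in foods.items() if drinks}
    for user, foods in nested_dict.items()
}
print(flat)
```

When a set holds more than one drink, `next(iter(...))` picks an arbitrary element, so this unwrapping only makes sense when each food maps to a single drink per user.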

data-structures dictionary pandas python