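Problem description
I am trying to read the JSON structure below into a pandas DataFrame, but it throws this error message:
ValueError: Mixing dicts with non-Series may lead to ambiguous ordering.
JSON data: '''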
{
"Name": "Bob","Mobile": 12345678,"Boolean": true,"Pets": ["Dog","cat"],"Address": {
"Permanent Address": "USA","Current Address": "UK"
},"Favorite Books": {
"Non-fiction": "Outliers","Fiction": {"Classic Literature": "The Old Man and the Sea"}
}
}
''' How do I handle this correctly? I have already tried the following script...
'''
j_df = pd.read_json('json_file.json')
j_df
with open(j_file) as jsonfile:
    data = json.load(jsonfile)
'''
Solution
First read the JSON from the file with json.load, then pass the result to json_normalize and chain DataFrame.explode on the Pets column:
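import json
import pandas as pd

# Parse the file into a plain Python dict first
with open('json_file.json') as data_file:
    data = json.load(data_file)

# Flatten the nested dicts into columns, then give each pet its own row
df = pd.json_normalize(data).explode('Pets').reset_index(drop=True)
print(df)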
  Name    Mobile  Boolean Pets Address.Permanent Address  \
0  Bob  12345678     True  Dog                       USA
1  Bob  12345678     True  cat                       USA

  Address.Current Address Favorite Books.Non-fiction  \
0                      UK                   Outliers
1                      UK                   Outliers

  Favorite Books.Fiction.Classic Literature
0                   The Old Man and the Sea
1                   The Old Man and the Sea
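To see what each step contributes, here is a minimal sketch with the record from the question inlined as a Python dict, so it runs without json_file.json. The variable names flat and df are only illustrative.

```python
import pandas as pd

# Same record as in the question, inlined so no file is needed.
data = {
    "Name": "Bob",
    "Mobile": 12345678,
    "Boolean": True,
    "Pets": ["Dog", "cat"],
    "Address": {"Permanent Address": "USA", "Current Address": "UK"},
    "Favorite Books": {
        "Non-fiction": "Outliers",
        "Fiction": {"Classic Literature": "The Old Man and the Sea"},
    },
}

# json_normalize flattens nested dicts into dot-separated column names;
# the "Pets" list is left as a single cell holding the whole list.
flat = pd.json_normalize(data)

# explode gives each list element its own row and repeats the other
# columns; reset_index renumbers the rows 0, 1, ...
df = flat.explode("Pets").reset_index(drop=True)

print(flat.shape)            # one row before exploding
print(df.shape)              # one row per pet afterwards
print(df["Pets"].tolist())
```

The flattening happens in json_normalize; explode only affects the list-valued Pets column, which is why every other value is repeated on both rows.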
Edit: To write the values into a sentence, you can select the necessary columns, drop duplicates, convert the result to a NumPy array, and loop over it:
for x, y in df[['Name', 'Favorite Books.Fiction.Classic Literature']].drop_duplicates().to_numpy():
    print(f"{x}'s favorite classic literature book is {y}.")
Bob's favorite classic literature book is The Old Man and the Sea.
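As a variation on the loop above, the sentences can also be collected into a list by zipping the two columns instead of converting to a NumPy array. This is only a sketch: the small DataFrame stands in for the exploded frame from the answer, and book_col, pairs and sentences are illustrative names.

```python
import pandas as pd

# Stand-in for the exploded DataFrame from the answer (two rows, one per pet).
book_col = "Favorite Books.Fiction.Classic Literature"
df = pd.DataFrame({
    "Name": ["Bob", "Bob"],
    "Pets": ["Dog", "cat"],
    book_col: ["The Old Man and the Sea", "The Old Man and the Sea"],
})

# Exploding "Pets" repeated the name/book pair, so drop the duplicates first.
pairs = df[["Name", book_col]].drop_duplicates()

# zip walks the two columns in parallel, yielding one (name, book) pair per row.
sentences = [
    f"{name}'s favorite classic literature book is {book}."
    for name, book in zip(pairs["Name"], pairs[book_col])
]

for sentence in sentences:
    print(sentence)
```

Keeping the sentences in a list makes them easy to reuse later, for example to write them to a file rather than printing them.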