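How to add a key-value pair to each item in a JSON file using language detection on a single data field

Problem description

I have weather alert data like this: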

"alerts": [
    {
        "description": "There is a risk of frost (Level 1 of 2).\nMinimum temperature: ~ -2 \u00b0C","end": 1612522800,"event": "frost","sender_name": "DWD / Nationales Warnzentrum Offenbach","start": 1612450800
    },{
        "description": "There is a risk of widespread icy surfaces (Level 1 of 3).\ncause: widespread ice formation or traces of snow","end": 1612515600,"event": "widespread icy surfaces"
    },{
        "description": "Es treten Windb\u00f6en mit Geschwindigkeiten um 55 km/h (15m/s,30kn,Bft 7) aus \u00f6stlicher Richtung auf. In exponierten Lagen muss mit Sturmb\u00f6en bis 65 km/h (18m/s,35kn,Bft 8) gerechnet werden.","end": 1612587600,"event": "WINDB\u00d6EN","start": 1612522800
    },

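Now I want to add a key-value pair to each alert dictionary, containing the language detected from its 'description' field. I tried this, but I can't get the syntax right...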

import json
from langdetect import detect

with open("kiel.json",'r') as f:
    data = json.loads(f.read())

data['ADDED_KEY'] = 'ADDED_VALUE'
#'ADDED_KEY' = 'lang' - should be added as a data field to EVERY alert
#'ADDED_VALUE' = 'en' or 'ger' - should be the detected language [via detect()] from data field 'description' of every alert 

with open("kiel.json",'w') as f:
    f.write(json.dumps(data,sort_keys=True,indent=4,separators=(',',': ')))

What actually happens is that the pair is added once, at the top level of the file:

{
"ADDED_KEY": "ADDED_VALUE","alerts": [
    {
        "description": "There is a risk of frost (Level 1 of 2).\nMinimum temperature: ~ -2 \u00b0C",

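Could you help me complete the code properly so that it accesses the right data fields?

Follow-up:

It can now happen that 'alerts' is not included in the data at all (for example, when no alert data is transmitted because the weather is fine). I still want the JSON to be generated in every case. I tried: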

for item in data['alerts']:
    if 'alerts' not in data:
        continue
else:
    item['lang'] = detect(item['description'])

But if the data I receive has no 'alerts' field, I get:

      for item in data['alerts']:
KeyError: 'alerts'

How can I solve this? Is 'continue' not the right statement here? Or do I have to swap the if and the for loop? Thanks again!

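Solution

You only need to iterate over the list stored under the dictionary key alerts and add the key and value to each item (each item is itself a dictionary).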

for item in data["alerts"]:
    item["ADDED_KEY"] = "ADDED_VALUE"

The following works. It iterates over the alerts and adds the key/value you mentioned.

import json
from langdetect import detect

with open("kiel.json",'r') as f:
    data = json.loads(f.read())

# Add 'lang' to EVERY alert, using the language detected from its 'description'.
# detect() returns ISO 639-1 codes, e.g. 'en' or 'de'.
for item in data['alerts']:
    item['lang'] = detect(item['description'])

with open("kiel.json",'w') as f:
    f.write(json.dumps(data,sort_keys=True,indent=4,separators=(',',': ')))
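The loop above still raises KeyError when the file has no 'alerts' field, which is the follow-up case in the question. In the attempt shown there, the 'alerts' not in data check sits inside the loop, so data['alerts'] is evaluated (and fails) before the check is ever reached. One possible way around this, as a minimal sketch: look the list up with dict.get and an empty default, so the loop does nothing when the key is absent. The names add_language, update_file and detect_fn are hypothetical helpers introduced only for this illustration, and skipping alerts without a description is an assumption, not something the original answer specifies.

```python
import json


def add_language(data, detect_fn, key="lang"):
    """Add the detected language to every alert in-place.

    detect_fn is any callable mapping text to a language code,
    for example langdetect.detect (assumed to be installed separately).
    """
    # data.get('alerts', []) yields an empty list when 'alerts' is missing,
    # so the loop body never runs and no KeyError is raised.
    for item in data.get("alerts", []):
        description = item.get("description")
        # Assumption: alerts without a description are left unchanged.
        if description:
            item[key] = detect_fn(description)
    return data


def update_file(path, detect_fn):
    """Read the JSON file, tag its alerts, and write it back in every case."""
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    add_language(data, detect_fn)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, sort_keys=True, indent=4)
```

With langdetect installed, the file from the question could then be processed with from langdetect import detect followed by update_file("kiel.json", detect). The file is rewritten whether or not it contains alerts, which matches the wish to always produce the JSON.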