How do I parse JSON data that's nested several layers deep?

Problem description

The goal of the code is to take the fetched JSON and parse it into raw location data (address, postcode, etc.). I'm very new to coding; this is a one-off task for a school project in geography that needs the locations of every McDonald's in Canada, so any other learning resources are welcome. The main problem I'm running into is that I want to write

for blank in blanks['']:

so that I can grab the data for the CSV output. But I've noticed that my data is nested several layers deep. For example:

{
    "features": [
        {
            "geometry": {
                "coordinates": [
                    -79.28662, 43.68758
                ]
            },
            "properties": {
                "name": "Vic Park/Gerrard",
                "shortDescription": "VIC PARK/G",
                "longDescription": "VIC PARK/GERRARD",
                "todayHours": "06:00 - 22:00",
                "drivetodayHours": "00:00 - 00:00",
                "id": "195500517230-en-ca",
                "filterType": [
                    "ALL_DAY_BREAKFAST", "BAKERY", "BREAKFAST", "CYT",
                    "DRIVETHRU", "INDOORDINING", "MCCAFE", "MOBILEOFFERS",
                    "MOBILEORDERS", "PARKINGAREA", "TWENTYFOURHOURS", "WIFI"
                ],
                "addressLine1": "2480 GERRARD STREET EAST",
                "addressLine2": "",
                "addressLine3": "SCARBOROUGH",
                "addressLine4": "Canada",
                "subDivision": "",
                "postcode": "M1N 4C3",
                "customAddress": "SCARBOROUGH,M1N 4C3",
                "telephone": "4166903659",

The information I want appears (I think, but I'm not sure) to live under properties, but my

for store in stores['features']:

statement doesn't let me pull out the "addressLine1" value, or any of the other fields, individually for the CSV. I'm wondering whether anyone has a solution for parsing data like this.

P.S. I've included my full code below in case there's a deeper problem.

import requests
import csv
import json

url = "https://www.mcdonalds.com/googleapps/GoogleRestaurantLocAction.do?method=searchLocation&latitude=43.6936965&longitude=-79.2969938&radius=1000000&maxResults=1700&country=ca&language=en-ca&showClosed=&hours24Text=Open%2024%20hr"

payload={}
files={}
headers = {
  'authority': 'www.mcdonalds.com','sec-ch-ua': '" Not;A Brand";v="99","Google Chrome";v="91","Chromium";v="91"','accept': '*/*','x-requested-with': 'XMLHttpRequest','sec-ch-ua-mobile': '?0','user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML,like Gecko) Chrome/91.0.4472.106 Safari/537.36','sec-fetch-site': 'same-origin','sec-fetch-mode': 'cors','sec-fetch-dest': 'empty','referer': 'https://www.mcdonalds.com/ca/en-ca/restaurant-locator.html','accept-language': 'en-GB,en-US;q=0.9,en;q=0.8','cookie': 'bm_sz=C04645E7F7A956C5F9D9C5A20DEAEC97~YAAQ1Cv2SEtfMBN6AQAAItxfEwwTVV2V2Tr7UWpPt1Ps7gl84FzQlmbWIm4kBBh5dxlK3w8RenwiEiKtvERE6dLmrwPwJUuy+14gU/LeEZvP+uxzyBr04oQXdcSEQuiOgdkAGasqnBrTw1mp5E5iehnRpvHBDdSqh8wRSgJV0eG4f8YwSz66BfntCBALtQNCAFK2; _abck=F05779F2345218EA4989FF467D897C5A~0~YAAQ1Cv2SExfMBN6AQAAItxfEwaIwCrBeP25JBhBb7TX+HmnLQgrj1TkosrB+oHSv9ctrxRukqEDUahpL1KkjpqjY1XY1yyulQ0ZRhsEfhY968YVsTOqfiosAu3kykd3pJG/bQ37XHwWs5qXpIdhMXRwJwXmkYtl3ETG8kXK2iZ22Q31COaSjNVACLaa7s9tCk9ItgLvUj5x9Nldjnd8AdXR0pXicrQY1IaruJyNqwMcJv42AUHW7iH4Ex9ZOSYsgEjLMNd44mS525X/gSNUTSOzoqoWsnH4MU59vfgLTwc2hVncAv67LBViTLxbWw4eVAvz7Z5phQfCmvoIy0PD8gy5iwPDMaD3GASrK9xScDPAPUI2wquxmsJ+f2cQaxZQKhvJCeH9cz14OZfx8ksA2ss53E0l0kDvgmnw~-1~-1~-1; ak_bmsc=BA4817D8DEE20E92C1E6251C54FC124348F62BD48F5F00005F91C9608B679D5F~plUkbYfsvYr5dCayJ9dMGEJ3QDgkmkv2mLpE7pCY9vW0xrdawvmyxfSnupw/4F7C48Akdn8PKsBniqz+7F+RZb8v4AkvH3c0RuvnynqJoni+kJcDYtPOxdMvdtGdTlZGIkSQNfpcxHNQDVlzojdSBX0vyBh/8seKQv10U67M7m787olYzg9jnsUwk3/VHBrnMDogiWJT8rNV7saSXunN0pAgucZWo/XhCpTJL+tI9urt0=; MCDCountry_code=US; bm_mi=BEE06312635FD442995BC0237BAFDA7C~f/RxgMW/JJSUc/wB9ZRg9fPD/76+wq/TaoWEZR1/ttrAiVTO256xhDTsVYc/kdHIjWkxvfO4XDcBjqe4hQ4qXt8Anpfi09vna/zcC7l6OVWpWeRSoZNztl7h5VF407L3XG+9CpzjSHNcaqAPRk5d0J5gLMtL/KmR8XBkAC0Syim7ST97nxNrPfLdlkSPMGm4Oy86xvY5PH5Nu47zS/gwhanBFg69tAdrQdaZewE2eGuzoJPsZit3UsihTzhXc4LY92hfSdh3/kZRId+NE8Jp0w==; bm_sv=7CACE3495320A7C0A6CF8F41DFE0EB36~F9KzvznVNk/fE4+ijLD5H/szY7O161rWlemmShElumIW7HN49Gq2d9Sd2tqBjCa9sJOX4zoehAkc8WvsID5Idon/hDlDeLJZuqnEmff4PN4a9yst3R170rBCm1egzGvCBmB1jq9aCwQm5VgIJgloPOdpiIPfD3kDxFbKhqMuS5U=; JSESSIONID=64PZkBXhhpvNjM4NganzSZ0r1npIIaM7Fo84EsxN.eap7node7; _abck=F05779F2345218EA4989FF467D897C5A~-1~YAAQ1Cv2SExyMBN6AQAA5Et0EwZueCejZbKz1VDGCq2sB43Yx4dq0SiiGeUS6gVpXRIdw3rA3OdpnGHq7tVzQ+IvPpEKwLML9736x1qB5SQxV3jai89y2B2QF6K8nKtyrDAes0qbeTyIrHu0Rh1HLs7CjNxiLi0wswbCZfsspI6fJZiEt+Itre3lfmua/HkhIRwpVTKqlVN5eQ8XIX+s1jJbINx/jUmMTW+jB5k4A5NARGChYH7rJQGYIT/oyZYpSbS3Yweqa4FRgGMW4gYZBN39+t2xSfewADLdpihfOnoZtakw9VhcvAKaf4mEzjB7WEfNJIZSjSE8DzvbJNIF41MGuAhhrnEBwBE8uVCZsA+2qjVPSADVp2Nn8JanJXCbucnLFOLsmPz3oVtGzentht1cHog4+eYOUlmw~0~-1~-1; bm_sv=7CACE3495320A7C0A6CF8F41DFE0EB36~F9KzvznVNk/fE4+ijLD5H/szY7O161rWlemmShElumIW7HN49Gq2d9Sd2tqBjCa9sJOX4zoehAkc8WvsID5Idon/hDlDeLJZuqnEmff4PN5ZCTzA250oKEeVeXaa6j4gEGJ9RRtrTXQdYXzzSx6fM9aLwif+We2vtIc1yLQgTt4=','dnt': '1'
} 

response = requests.request("GET",url,headers = headers,data = payload,files = files)


stores = json.loads(response.text)

with open('Mcdonlocation.csv',mode='w',newline='') as CSVFile:  # newline='' avoids blank rows between records on Windows
    writer = csv.writer(CSVFile,delimiter=",",quotechar='"',quoting=csv.QUOTE_MINIMAL)

    writer.writerow([
        "addressLine1","addressLine2","addressLine3","subDivision","postcode","telephone"
        ])

    for store in stores['features']:
        row = []
        Match_Address1= store['properties']["addressLine1"]
        Match_Address2= store['properties']["addressLine2"]
        Match_Address3= store['properties']["addressLine3"]
        subDivision= store['properties']["subDivision"]
        Postalcode= store['properties']["postcode"]
        telephone= store['properties']["telephone"]

        row.append(Match_Address1)
        row.append(Match_Address2)
        row.append(Match_Address3)
        row.append(subDivision)
        row.append(Postalcode)
        row.append(telephone)
        writer.writerow(row)

Solution

I think the basic answer to your question is "look at the types." The Python json conversion table tells you what to expect for each JSON type. Let's load your file in the Python interpreter and see what we have:

>>> input = dat.read()
>>> stores = json.loads(input)
>>> type(stores)
<class 'dict'>
>>> type(stores['features'])
<class 'list'>
>>> type( stores['features'][0] )
<class 'dict'>
>>> type( stores['features'][0]['properties'] )
<class 'dict'>
>>> type( stores['features'][0]['properties']['telephone'] )
<class 'str'>
>>> stores['features'][0]['properties']['telephone']
'4166903659'

Every object has a type; every type has its own methods. Just keep following the chain down in the same way.
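For example, once you can see that features is a list of dicts and that each element carries a properties dict, pulling a field out is just chained indexing. A minimal sketch, assuming stores is the dict produced by json.loads above and using the keys from your sample data:

>>> for store in stores['features']:
...     props = store['properties']   # the nested dict that holds the address fields
...     print(props['addressLine1'], props['postcode'])
...
2480 GERRARD STREET EAST M1N 4C3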


It looks like your data is structured like this:

features:
- geometry:
    coordinates:
    - float
    - float
  properties:
    addressLine1: str
    addressLine2: str
    addressLine3: str
    addressLine4: str
    customAddress: str
    drivetodayHours: str
    filterType:
    - str
    - ...
    id: str
    longDescription: str
    name: str
    postcode: str
    shortDescription: str
    subDivision: str
    telephone: str
    todayHours: str

You want to pull information out of the individual "features" elements, which appear to be the stores, so code like your existing logic is a good starting point:

for store in data['features']:
   csv_row = process_store(store)

From there you just need to decide what you want to extract, for example:

coord_1 = store['geometry']['coordinates'][0]
custom_address = store['properties']['customAddress']
...

Compared with your existing code, I think you simply hadn't noticed that there is a properties key you can drill into below the initial features level.
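
Putting those pieces together, here is a minimal sketch of what process_store and the surrounding CSV loop could look like. process_store is just the hypothetical helper name from the snippet above, the field names come from your sample data, and stores is the dict you already get from json.loads(response.text):

import csv

def process_store(store):
    # Flatten one "feature" dict into a CSV-ready row.
    props = store['properties']                   # nested dict with the address fields
    lon, lat = store['geometry']['coordinates']   # coordinates sit under "geometry", not "properties"
    return {
        'addressLine1': props['addressLine1'],
        'addressLine3': props['addressLine3'],
        'postcode': props['postcode'],
        'telephone': props['telephone'],
        'longitude': lon,
        'latitude': lat,
    }

fieldnames = ['addressLine1', 'addressLine3', 'postcode', 'telephone', 'longitude', 'latitude']

with open('Mcdonlocation.csv', mode='w', newline='') as csv_file:
    writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
    writer.writeheader()
    for store in stores['features']:
        writer.writerow(process_store(store))

csv.DictWriter keeps the header row and the per-store rows aligned by key, so adding or removing a field only means touching process_store and the fieldnames list.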