How do I convert GTFS-RT Trip Updates data into a DataFrame?

Problem description

I have downloaded some GTFS-RT Trip Updates data as a dictionary using the following code:

from google.transit import gtfs_realtime_pb2
from protobuf_to_dict import protobuf_to_dict
import requests
import pandas as pd

Feed = gtfs_realtime_pb2.FeedMessage()
# requests fetches the raw protobuf payload from the feed URL,
# in this case the trip updates of all buses ('link' is a placeholder)
response = requests.get('link')
Feed.ParseFromString(response.content)

# Convert the parsed protobuf Feed into a plain (nested) dict
buses_dict = protobuf_to_dict(Feed)

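The output is a dictionary containing many nested dictionaries. The trip update for a single bus looks like this (shown here in protobuf text format):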

id: "14010512942203036"
trip_update {
  trip {
    trip_id: "14010000550082549"
    start_date: "20210120"
    schedule_relationship: SCHEDULED
  }
  stop_time_update {
    stop_sequence: 24
    arrival {
      delay: -20
      time: 1611145420
      uncertainty: 0
    }
    departure {
      delay: 52
      time: 1611145492
      uncertainty: 0
    }
    stop_id: "9022001005006001"
  }
  stop_time_update {
    stop_sequence: 25
    arrival {
      delay: 52
      time: 1611146092
    }
    departure {
      delay: 52
      time: 1611146092
    }
    stop_id: "9022001005007002"
  }
  vehicle {
    id: "9031001004002234"
  }
  timestamp: 1611145514
}

Do you know how to convert this data into a more useful format, say a pandas DataFrame?

Thanks in advance!

Solution

I used this URL for testing:

Drexel_2021_Lecture_1.Rmd

All you need to do is add this line to the end of your script to get a pandas DataFrame:

Drexel_2021_Lecture_2.Rmd

It will split the dictionary into these columns:

Drexel_2021_Lecture_3.Rmd
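
As a minimal sketch of one way to do the flattening (not necessarily the exact line this answer refers to), pandas.json_normalize can turn the nested dictionary into flat columns. The sample buses_dict below is a hand-written stand-in copied from the entity in the question, under the assumption that protobuf_to_dict returns a top-level 'entity' list with this nested shape; the names entities, trips_df and stops_df are illustrative only.

```python
import pandas as pd

# Hand-written stand-in for buses_dict, copied from the sample entity in the
# question. Assumption: protobuf_to_dict yields this nested shape.
buses_dict = {
    "entity": [
        {
            "id": "14010512942203036",
            "trip_update": {
                "trip": {
                    "trip_id": "14010000550082549",
                    "start_date": "20210120",
                },
                "stop_time_update": [
                    {
                        "stop_sequence": 24,
                        "arrival": {"delay": -20, "time": 1611145420, "uncertainty": 0},
                        "departure": {"delay": 52, "time": 1611145492, "uncertainty": 0},
                        "stop_id": "9022001005006001",
                    },
                    {
                        "stop_sequence": 25,
                        "arrival": {"delay": 52, "time": 1611146092},
                        "departure": {"delay": 52, "time": 1611146092},
                        "stop_id": "9022001005007002",
                    },
                ],
                "vehicle": {"id": "9031001004002234"},
                "timestamp": 1611145514,
            },
        }
    ]
}

# Keep only entities that carry at least one stop_time_update; record_path
# raises a KeyError for entities where that path is missing.
entities = [
    e for e in buses_dict["entity"]
    if e.get("trip_update", {}).get("stop_time_update")
]

# Option 1: one row per entity. Nested dicts become flat columns, while
# stop_time_update stays as a list inside a single column.
trips_df = pd.json_normalize(entities, sep="_")

# Option 2: one row per stop_time_update, with trip-level fields repeated
# on every row through the meta argument.
stops_df = pd.json_normalize(
    entities,
    record_path=["trip_update", "stop_time_update"],
    meta=[
        "id",
        ["trip_update", "trip", "trip_id"],
        ["trip_update", "trip", "start_date"],
        ["trip_update", "vehicle", "id"],
        ["trip_update", "timestamp"],
    ],
    sep="_",
    errors="ignore",  # tolerate entities that lack some of the meta fields
)

# The time fields are POSIX seconds; convert them to timezone-aware datetimes.
stops_df["arrival_time"] = pd.to_datetime(stops_df["arrival_time"], unit="s", utc=True)

print(stops_df.columns.tolist())
```

With real data, the sample dictionary would be replaced by the buses_dict produced by the script in the question. The second form is usually the more convenient one for delay analysis, since each stop gets its own row; fields that a stop omits (such as uncertainty on the second stop here) come out as NaN.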