Extracting data from a text file

Problem description

I have a file containing nearly 2000 English tweets. It looks like this:

{"data":[{"no.":"1241583652212862978","created":"2020-03-22T04:33:04.000Z","tweet":"@OHAOregon My friend says we should not reuse masks to combat coronavirus,is that correct?"},{"no.":"1241583655538941959","created":"2020-03-22T04:33:05.000Z","tweet":" I know it’s from a few days ago,but these books are in good shape"},.......]}

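I only want the tweets themselves. How can I extract just the "tweet" field from this text file? Any suggestions would help. Thanks in advance.

Solution

Your file is in JSON format. Take a look at Python's json library, which lets you parse the file and pull out the tweets. https://docs.python.org/3/library/json.html

Assuming d is the object you get after parsing, it is as simple as: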

tweet = d["data"][0]["tweet"]

In case it helps, here is what I did with your example in the shell:

>>> d = {'data': [{'no.': '1241583652212862978','created': '2020-03-22T04:33:04.000Z','tweet': '@OHAOregon My friend says we should not reuse masks to combat coronavirus,is that correct?'},{'no.': '1241583655538941959','created': '2020-03-22T04:33:05.000Z','tweet': ' I know it’s from a few days ago,but these books are in good shape'}]}
>>> print(d["data"])
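[{'no.': '1241583652212862978', 'created': '2020-03-22T04:33:04.000Z', 'tweet': '@OHAOregon My friend says we should not reuse masks to combat coronavirus,is that correct?'}, {'no.': '1241583655538941959', 'created': '2020-03-22T04:33:05.000Z', 'tweet': ' I know it’s from a few days ago,but these books are in good shape'}]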
>>> print(d["data"][0]["tweet"])
@OHAOregon My friend says we should not reuse masks to combat coronavirus,is that correct?
>>>
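
The session above only prints the first tweet. To collect every tweet, something like the following may work. It is a minimal sketch, assuming the whole file is a single valid JSON object with a top-level "data" list, as in the sample. The names extract_tweets and load_tweets, and the shortened inline sample, are made up for illustration.

```python
import json


def extract_tweets(d):
    # Collect the "tweet" field of every entry under the top-level "data" key.
    return [entry["tweet"] for entry in d["data"]]


def load_tweets(path):
    # Hypothetical helper: parse the file as JSON, then pull out the tweets.
    # Assumes the file is UTF-8 encoded.
    with open(path, encoding="utf-8") as f:
        return extract_tweets(json.load(f))


# Shortened placeholder data in the same shape as the question's file,
# so the sketch runs without the real file.
sample = (
    '{"data":['
    '{"no.":"1","created":"2020-03-22T04:33:04.000Z","tweet":"first"},'
    '{"no.":"2","created":"2020-03-22T04:33:05.000Z","tweet":"second"}'
    ']}'
)
tweets = extract_tweets(json.loads(sample))
print(tweets)
```

With the real file you would call load_tweets with its path instead of parsing the inline sample. If an entry could lack a "tweet" key, entry.get("tweet") would avoid a KeyError, at the cost of putting None in the list.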

python-3.x text-extraction text-files tweets