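Problem description
I am trying to obtain the full text of each tweet with the code below. I understand that the tweet_mode parameter should be set to 'extended', but since this is not a standard query, I don't know where it belongs. The text field always contains only partial text, cut off with "…" and followed by a URL. With this configuration, how do I get the full tweet?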
from twython import Twython,TwythonStreamer
import json
import pandas as pd
import csv
def process_tweet(tweet):
    d = {}
    d['hashtags'] = [hashtag['text'] for hashtag in tweet['entities']['hashtags']]
    d['text'] = tweet['text']
    d['user'] = tweet['user']['screen_name']
    d['user_loc'] = tweet['user']['location']
    return d

# Create a class that inherits TwythonStreamer
class MyStreamer(TwythonStreamer):
    # Received data
    def on_success(self, data):
        # Only collect tweets in English
        if data['lang'] == 'en':
            tweet_data = process_tweet(data)
            self.save_to_csv(tweet_data)

    # Problem with the API
    def on_error(self, status_code, data):
        print(status_code, data)
        self.disconnect()

    # Save each tweet to csv file
    def save_to_csv(self, tweet):
        with open(r'tweets_about_california.csv', 'a') as file:
            writer = csv.writer(file)
            writer.writerow(list(tweet.values()))

# Instantiate from our streaming class
stream = MyStreamer(creds['CONSUMER_KEY'], creds['CONSUMER_SECRET'], creds['ACCESS_TOKEN'], creds['ACCESS_SECRET'])
# Start the stream
stream.statuses.filter(track='california', tweet_mode='extended')
Solution
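The tweet_mode=extended parameter has no effect on the v1.1 streaming API, because every Tweet is delivered in both the extended format and the default (140-character) format.
If a Tweet object has truncated: true, the payload contains an additional element, extended_tweet. That is where the full_text value is stored.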
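As a minimal sketch of how process_tweet from the question could be adapted, the version below reads full_text from extended_tweet whenever the Tweet is marked as truncated and falls back to text otherwise. The helper name full_text_of and the sample payload are made up for illustration. Taking the hashtags from extended_tweet as well assumes that the complete entities are nested there for truncated Tweets.

```python
def full_text_of(tweet):
    """Return the untruncated text of a v1.1 streaming Tweet payload (a dict)."""
    # Truncated Tweets carry the complete text under extended_tweet.full_text
    if tweet.get('truncated') and 'extended_tweet' in tweet:
        return tweet['extended_tweet']['full_text']
    return tweet['text']


def process_tweet(tweet):
    # Assumption: for truncated Tweets the complete entities also sit under
    # extended_tweet, so hashtags are read from there when it is present.
    source = tweet.get('extended_tweet', tweet) if tweet.get('truncated') else tweet
    return {
        'hashtags': [h['text'] for h in source['entities']['hashtags']],
        'text': full_text_of(tweet),
        'user': tweet['user']['screen_name'],
        'user_loc': tweet['user']['location'],
    }


# Hypothetical, heavily trimmed payload used only to exercise the functions
sample = {
    'truncated': True,
    'text': 'Sunset over the coast… https://t.co/xyz',
    'entities': {'hashtags': []},
    'extended_tweet': {
        'full_text': 'Sunset over the coast near Big Sur tonight #california',
        'entities': {'hashtags': [{'text': 'california'}]},
    },
    'user': {'screen_name': 'example_user', 'location': 'CA'},
}
print(process_tweet(sample)['text'])
```

With this change the tweet_mode argument can simply be dropped from the stream.statuses.filter call, since it does nothing on the streaming endpoint.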
Note that this answer applies only to the v1.1 Twitter API. In v2, the streaming API returns the full Tweet text by default (Twython does not currently support v2).