Problem description
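I am trying to get live tweet data from a specific user through a stream using tweepy, but I have noticed a delay of about 4 seconds between the exact timestamp at which the tweet was posted and the timestamp printed alongside the text in my tweepy program. Is this normal or expected, or is there a way to make my code more efficient? Thanks!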
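import json
from datetime import datetime

# These import paths match tweepy 3.x, which the code below appears to target.
from tweepy import OAuthHandler, Stream
from tweepy.streaming import StreamListener

# CONSUMER_KEY, CONSUMER_SECRET, ACCESS_TOKEN and ACCESS_TOKEN_SECRET are
# assumed to be defined elsewhere (for example, in a credentials module).


# # # # TWITTER STREAMER # # # #
class TwitterStreamer():
    """
    Class for streaming and processing live tweets.
    """

    def __init__(self):
        pass

    def stream_tweets(self):
        # This handles Twitter authentication and the connection to the Twitter Streaming API
        listener = TweetListener()
        auth = OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
        auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
        stream = Stream(auth, listener)
        # This line filters the Twitter stream to capture tweets from the given user IDs:
        stream.filter(follow=['user_id'])


# # # # TWITTER STREAM LISTENER # # # #
class TweetListener(StreamListener):
    # This is a basic listener that just prints received tweets
    # Only returns the tweets of the given user

    def on_status(self, status):
        # Note: in tweepy 3.x this is dispatched from StreamListener.on_data,
        # so it is not called while on_data is overridden below.
        if status.user.id_str != 'user_id':
            return
        print(status.text)

    def on_data(self, data):
        try:
            json_load = json.loads(data)
            text = json_load['text']
            # Skip retweets
            if 'RT @' not in text:
                print(text)
                print(datetime.now())  # local time at which the tweet was received
            return True
        except Exception as e:
            print("Error on_data %s" % str(e))
        return True

    def on_error(self, status):
        print(status)


if __name__ == '__main__':
    streamer = TwitterStreamer()
    streamer.stream_tweets()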
Solution
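Yes, this is normal. The delay depends on many factors, such as your network connection and location, but a small delay of a few seconds is generally to be expected.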
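If you want to measure the delay rather than eyeball it, one option is to compare the creation time carried in each payload with the local time at which the message arrives. The following is only a rough sketch: `delivery_delay_seconds` is a hypothetical helper, not part of tweepy, and it assumes the raw JSON contains either a `timestamp_ms` field or a v1.1-style `created_at` string, and that the local clock is reasonably well synchronised (for example via NTP).

```python
import json
from datetime import datetime, timezone

# Format of the "created_at" field in Twitter API v1.1 payloads,
# e.g. "Wed Oct 10 20:19:24 +0000 2018".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def delivery_delay_seconds(raw_data, received_at=None):
    """Return the seconds between a tweet's creation and its arrival.

    Hypothetical helper for illustration. Returns None when the payload
    carries no usable timestamp (for example, a delete notice).
    """
    payload = json.loads(raw_data)
    if received_at is None:
        received_at = datetime.now(timezone.utc)

    if "timestamp_ms" in payload:
        # Millisecond-resolution timestamp, assumed present on streamed tweets.
        created = datetime.fromtimestamp(
            int(payload["timestamp_ms"]) / 1000, tz=timezone.utc
        )
    elif "created_at" in payload:
        # Fallback with one-second resolution.
        created = datetime.strptime(payload["created_at"], CREATED_AT_FORMAT)
    else:
        return None

    return (received_at - created).total_seconds()
```

Inside `on_data` you could then call `delivery_delay_seconds(data)` and print the result next to the text. Keep in mind that `created_at` only has one-second resolution, so part of a measured 4-second gap may simply be rounding and clock skew.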