Why is there a delay when streaming live tweets?

Problem description

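I am trying to get live tweet data from a specific user through a stream using tweepy, but I have noticed a delay of about 4 seconds between the exact timestamp at which the tweet was posted and the timestamp at which my tweepy program prints its text. Is this normal or expected, or is there a way to make my code more efficient? Thanks!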

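import json
from datetime import datetime

# Imports for tweepy 3.x, where StreamListener is still available
from tweepy import OAuthHandler, Stream
from tweepy.streaming import StreamListener

# CONSUMER_KEY, CONSUMER_SECRET, ACCESS_TOKEN and ACCESS_TOKEN_SECRET
# must be defined with your own credentials before running.

# # # # TWITTER STREAMER # # # #
class TwitterStreamer():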
    """
    Class for streaming and processing live tweets.
    """
    def __init__(self):
        pass


    def stream_tweets(self):
        # This handles Twitter authentication and the connection to the Twitter Streaming API
        listener = TweetListener()
        auth = OAuthHandler(CONSUMER_KEY,CONSUMER_SECRET)
        auth.set_access_token(ACCESS_TOKEN,ACCESS_TOKEN_SECRET)
        stream = Stream(auth,listener)

        # This line filters the stream by user ID, not by keyword
        # ('user_id' is a placeholder for the numeric ID, passed as a string)
        stream.filter(follow=['user_id'])


# # # # TWITTER STREAM LISTENER # # # #
class TweetListener(StreamListener):
    
    # This is a basic listener that just prints received tweets

    # Only handles the tweets of the given user
    def on_status(self,status):
        if status.user.id_str != 'user_id':
            return
        print(status.text)

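    # Note: overriding on_data replaces tweepy's default dispatch,
    # so on_status above is never called while this method exists.
    def on_data(self,data):
        try:
            json_load = json.loads(data)
            text = json_load['text']
            if 'RT @' not in text:
                print(text)
                # Local receive time (datetime.now, not datetime.Now)
                print(datetime.now())
            return True
        except BaseException as e:
            print("Error on_data %s" % str(e))
        return True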
          

    def on_error(self,status):
        print(status)

if __name__ == '__main__':
    
    streamer=TwitterStreamer()
    streamer.stream_tweets()

Solution

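Yes, this is normal. The delay depends on many factors, such as your network connection and your location, but in general I would expect a small delay of a few seconds.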
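To put a number on the delay rather than comparing timestamps by eye, one option is to compare the tweet's own "created_at" field with the local receive time. The sketch below is only an illustration: delivery_delay_seconds is a hypothetical helper, not part of tweepy, and it assumes the "created_at" string format used in v1.1 tweet payloads and an English/C locale for parsing the day and month names.

```python
from datetime import datetime, timezone

# Assumed format of the "created_at" field in a v1.1 tweet payload,
# e.g. "Wed Oct 10 20:19:24 +0000 2018".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def delivery_delay_seconds(created_at, received_at=None):
    """Return the seconds between tweet creation and local receipt.

    Hypothetical helper for illustration; not part of tweepy.
    """
    created = datetime.strptime(created_at, CREATED_AT_FORMAT)
    if received_at is None:
        # Use a timezone-aware "now" so the subtraction is valid.
        received_at = datetime.now(timezone.utc)
    return (received_at - created).total_seconds()


# Example with a fixed receive time, 4 seconds after creation.
received = datetime(2018, 10, 10, 20, 19, 28, tzinfo=timezone.utc)
print(delivery_delay_seconds("Wed Oct 10 20:19:24 +0000 2018", received))
```

Inside on_data this could be called as delivery_delay_seconds(json_load['created_at']). Keep in mind that "created_at" only has one-second resolution, and that any skew in the local clock shows up as apparent delay, so part of a measured 4 seconds may not be network or API latency at all.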