Tweepy double scraping

Problem description

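I have been using tweepy to scrape Twitter for about 9 months. Last Friday my scraper stopped working correctly, because it started doing two things: 1) it returns an empty list instead of the previous tweets, even when the user's profile contains tweets, and 2) it scrapes old tweets when it should only scrape the most recent one. Has anyone run into the same problem? Any suggested fixes are appreciated!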

import re

import tweepy

# consumer_key, consumer_secret, access_key and access_secret are assumed
# to be defined elsewhere (they are not shown in the question).

def get_tweets(username):
    # Authorization to consumer key and consumer secret
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    # Access to user's access key and access secret
    auth.set_access_token(access_key, access_secret)
    # Calling api (wait_on_rate_limit_notify is a Tweepy 3.x argument; Tweepy 4.0 removed it)
    api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
    text_of_tweet = None
    tweet_id = None

    number_of_tweets = 1
    # Scrape the most recent tweet on the user's timeline
    tweet = api.user_timeline(screen_name=username, count=number_of_tweets, include_rts=False)


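    # Keep the text and ID of the returned tweet
    for item in tweet:
        text_of_tweet = item.text
        tweet_id = item.id

    # An empty timeline result leaves text_of_tweet as None; return early
    # instead of raising a TypeError in the ASCII check below
    if text_of_tweet is None:
        return text_of_tweet, tweet_id

    # Strip non-ASCII characters if the text contains any
    if not all(ord(c) < 128 for c in text_of_tweet):
        text_of_tweet = conv_true_ascii(text_of_tweet)

    # Keep only the first sentence, then only its first line
    list_of_sentences = re.split(r'(?<=[^A-Z].[.?]) +(?=[A-Z])', text_of_tweet)
    text_of_tweet = list_of_sentences[0]
    text_of_tweet = text_of_tweet.split('\n')[0]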

    # Write to CSV
    # csvWriter.writerow([text_of_tweet,tweet_time,tweet_id])

    # Return tweet
    return text_of_tweet,tweet_id
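
One possible explanation for the empty list, offered as an assumption rather than a confirmed diagnosis: Twitter's documentation for the v1.1 user timeline endpoint notes that retweets count toward `count` even when `include_rts` is false. With `count=1`, the call can therefore return nothing when the newest item on the timeline is a retweet. A minimal sketch of a workaround is to request a larger batch and pick the newest tweet locally, skipping IDs that were already processed. The helper name `newest_unseen`, its `last_seen_id` parameter, and the `SimpleNamespace` stand-ins are hypothetical and not part of the original code or of Tweepy.

```python
from types import SimpleNamespace


def newest_unseen(statuses, last_seen_id=None):
    """Return the status with the highest id above last_seen_id, or None."""
    candidates = [
        s for s in statuses
        if last_seen_id is None or s.id > last_seen_id
    ]
    if not candidates:
        return None
    # Tweet IDs grow over time, so the largest id is the newest tweet.
    return max(candidates, key=lambda s: s.id)


# Stand-in objects exposing the same .id and .text attributes as the
# Status objects returned by api.user_timeline (assumption for the demo).
batch = [
    SimpleNamespace(id=103, text="newest"),
    SimpleNamespace(id=101, text="older"),
]
latest = newest_unseen(batch, last_seen_id=101)
```

With the real API, `batch` could come from something like `api.user_timeline(screen_name=username, count=20, include_rts=False)`, where 20 is an arbitrary batch size. Storing the returned ID between runs and passing it back as `last_seen_id` would also prevent older tweets from being processed again.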

def conv_true_ascii(single_tweet):
    # Drop non-ASCII characters, then decode back to str. Calling str() on
    # the bytes object would leave b'...' wrapper text and escaped newlines.
    return single_tweet.encode('ascii', errors='ignore').decode('ascii')
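
The text clean-up steps in the question can be tried in isolation, without calling the Twitter API. The sketch below assumes the intent is to drop non-ASCII characters and keep the first sentence of the first line. The function names `to_ascii` and `first_sentence` and the sample string are hypothetical. The regular expression is the one from the question.

```python
import re


def to_ascii(text):
    """Drop every non-ASCII character from text and return a str."""
    return text.encode("ascii", errors="ignore").decode("ascii")


def first_sentence(text):
    """Return the first sentence of text, cut at the first newline."""
    sentences = re.split(r'(?<=[^A-Z].[.?]) +(?=[A-Z])', text)
    return sentences[0].split("\n")[0]


# Sample input with one non-ASCII character, an apostrophe and a newline.
sample = "Caf\u00e9 opens today. Don't miss it!\nMore soon."
cleaned = first_sentence(to_ascii(sample))
```

Decoding the bytes back with `.decode("ascii")` keeps apostrophes and newlines intact. Building the string with `str()` on a bytes object and stripping quote characters afterwards does not.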


Solution

No effective solution to this problem has been found yet.
