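Problem description
In the first part of my code, I get a list of 20 users who have used the word "gym" in a tweet. This part works fine.
In the second part, I am trying to take the usernames obtained in the first part and fetch each user's 20 most recent tweets.
The code I currently have runs without errors, but it definitely does not return 20 tweets for each person found in the first part; all it does is return the last row of the results from the first part.
Below is my code. As you can see, I tried to use the list created in the first part, "tweets", as the id input in the second part, and tried to use [2] to pick out only the third column of the list (where the usernames are).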
import tweepy
from tweepy import OAuthHandler
import pandas as pd
access_token = ''
access_token_secret = ''
consumer_key = ''
consumer_secret = ''
auth = tweepy.OAuthHandler(consumer_key,consumer_secret)
auth.set_access_token(access_token,access_token_secret)
api = tweepy.API(auth,wait_on_rate_limit=True,wait_on_rate_limit_notify=True)
tweets = []
count = 20
for tweet in tweepy.Cursor(api.search,q="gym"+'-filter:retweets',since='2020-02-08',tweet_mode='extended',lang='en').items(count):
    try:
        data = [tweet.full_text,tweet.user.screen_name]
        data = tuple(data)
        tweets.append(data)
    except tweepy.TweepError as e:
        print(e.reason)
        continue
    except StopIteration:
        break
df = pd.DataFrame(tweets,columns=['Tweet','@ Name'])
print(df)
new_tweets = []
username = tweets[1]
count = 20
for user in tweepy.Cursor(api.user_timeline,id=username,tweet_mode='extended').items(count):
    try:
        data = [tweet.full_text,tweet.user.screen_name]
        data = tuple(data)
        new_tweets.append(data)
    except tweepy.TweepError as e:
        print(e.reason)
        continue
    except StopIteration:
        break
df2 = pd.DataFrame(new_tweets,columns=['Tweets','@ Name'])
print(df2)
df2.to_csv('test3.csv')
Here is my output:
Tweet @ Name
0 Gym chronicles ? chocodilish
1 @neilmcrowther @SpotifyUK I have a Spotify pla... carey_bamber
2 Food pick-up for virtual learners today 9:00-1... allentrotter
3 couldn’t sleep so gym it is esmeraldahdz_
4 We need I.D. to buy beer,to buy ciggies,we n... beryl1946
5 So I actually have to go to the gym to have a ... ___tshego
6 Currently three Marcela Bielsa lookalikes in t... sammyptweet
7 I’m dreading going to the gym and coming back ... cinnamonKayyy
8 yes we were there... what the fuck is going on... blubbsie
9 @IamEzeNwanyi @LilburnEnugu @mr_robmichael @He... _lilivet
10 GYM WEEK 2 LEGO Mondo_92
11 Webinars for this week are as follows,\nBrain ... EdCentreMayo
12 I rather be wakin up for the gym than work tbh. illmindofPAT
13 First day back in the gym doing ?BASKETBALL ? ... AUMWarhawksWBB
14 i don’t wanna go to school today since i know ... CEOofTsuyuAsui
15 @sunikies GYM DHSHSKDSH (i miss it :( ),indiv... shienIove
16 @PaulMumba_ Is that gym work I'm seeing on tha... jaymaxgie
17 Body builders on Instagram don’t go to the gym... OfficialShann_
18 @DivinePooh gym and game room FinesseDee2
19 I use to wake up to go to the gym at this hour... missgenafire
20
Tweets @ Name
0 I use to wake up to go to the gym at this hour... missgenafire
Process finished with exit code 0
Any help is greatly appreciated.
Solution
In the second for loop you are still using the tweet variable from the first for loop. You should use the user variable.
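To see why the second loop keeps producing data from the first part: in Python, a for loop variable stays bound to its last value after the loop ends. Here is a minimal sketch with made-up data (the lists below are illustrative stand-ins, not real Twitter API results):

```python
# Illustrative data only: stand-ins for the search results and one user's timeline.
search_results = ["first tweet", "second tweet", "last tweet"]
timeline = ["timeline A", "timeline B"]

for tweet in search_results:
    pass  # once this loop ends, `tweet` is still bound to "last tweet"

wrong = []
right = []
for user in timeline:
    wrong.append(tweet)  # stale variable left over from the first loop
    right.append(user)   # the current item of this loop

print(wrong)
print(right)
```

Every entry in wrong is the last item of the first loop, which is consistent with the output in the question showing the last row of the first part.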
A few things:
- You are not iterating over 20 different usernames. You hard-coded it to use only one: username = tweets[1]. Even then, that is still a ('tweetmessage','username') tuple, so you want the string at index position 1 of it, i.e. tweets[1][1].
- You iterate with user as the loop variable, but then call the tweet variable:
    for user in tweepy.Cursor(api.user_timeline,id=username,tweet_mode='extended').items(count):
        try:
            data = [user.full_text,user.user.screen_name]  #<-- correct
            #data = [tweet.full_text,tweet.user.screen_name]  #<-- incorrect
            ...
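The indexing point above can be checked without any Twitter credentials. This is a small sketch using sample rows shaped like the tweets list from the first part; the two rows are illustrative data only:

```python
# Illustrative data shaped like the `tweets` list built in the first part:
# each entry is a (tweet_text, screen_name) tuple.
tweets = [
    ("Gym chronicles", "chocodilish"),
    ("couldn't sleep so gym it is", "esmeraldahdz_"),
]

entry = tweets[1]        # the whole tuple, not a username
username = tweets[1][1]  # the screen name string inside that tuple

# Unpacking each tuple yields every username, in order.
usernames = [name for _text, name in tweets]

print(entry)
print(username)
print(usernames)
```

Passing entry as the id would hand the API a tuple, whereas username is the plain string it expects.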
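Full code:
# Note: this uses the Tweepy 3.x API (api.search, TweepError,
# wait_on_rate_limit_notify), which changed in Tweepy 4.0.
import tweepy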
from tweepy import OAuthHandler
import pandas as pd
access_token = ''
access_token_secret = ''
consumer_key = ''
consumer_secret = ''
auth = tweepy.OAuthHandler(consumer_key,consumer_secret)
auth.set_access_token(access_token,access_token_secret)
api = tweepy.API(auth,wait_on_rate_limit=True,wait_on_rate_limit_notify=True)
tweets = []
count = 20
for tweet in tweepy.Cursor(api.search,q="gym"+'-filter:retweets',since='2020-02-08',tweet_mode='extended',lang='en').items(count):
    try:
        data = [tweet.full_text,tweet.user.screen_name]
        data = tuple(data)
        tweets.append(data)
    except tweepy.TweepError as e:
        print(e.reason)
        continue
    except StopIteration:
        break
df = pd.DataFrame(tweets,columns=['Tweet','@ Name'])
print(df)
new_tweets = []
count = 20
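# Loop over every (tweet text, username) pair collected in the first part
for tweet,username in tweets:
    # Pass the current username as the id so each user's own timeline is fetched
    for user in tweepy.Cursor(api.user_timeline,id=username,tweet_mode='extended').items(count):
        try:
            # `user` is the current timeline tweet, despite its name
            data = [user.full_text,user.user.screen_name]
            data = tuple(data)
            new_tweets.append(data)
        except tweepy.TweepError as e:
            print(e.reason)
            continue
        except StopIteration:
            break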
df2 = pd.DataFrame(new_tweets,columns=['Tweets','@ Name'])
print(df2)
df2.to_csv('test3.csv')
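If you want to check the nested-loop logic without hitting the API, one option is to pull it into a function that takes the fetching step as an argument. This is only a sketch: collect_timelines, fetch_timeline and fake_timeline are hypothetical names that are not part of Tweepy, and skipping repeated usernames is my own assumption (the same user can appear more than once in the search results), not something the original code does.

```python
def collect_timelines(tweets, fetch_timeline, count=20):
    """Return (text, screen_name) rows for every distinct username in `tweets`.

    `fetch_timeline` is a hypothetical callable (username, count) -> iterable of
    tweet texts; in the real script its role is played by the tweepy.Cursor call.
    """
    rows = []
    seen = set()
    for _text, username in tweets:
        if username in seen:
            continue  # assumption: fetch each user's timeline only once
        seen.add(username)
        for text in fetch_timeline(username, count):
            rows.append((text, username))
    return rows


def fake_timeline(username, count):
    # Stand-in for the API: returns `count` made-up tweet texts.
    return [f"{username} tweet {i}" for i in range(count)]


tweets = [("a", "alice"), ("b", "bob"), ("c", "alice")]
rows = collect_timelines(tweets, fake_timeline, count=2)
print(rows)
```

With the real API you would pass a small wrapper around the tweepy.Cursor call in place of fake_timeline.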