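My goal is to find out what the users of r/(subreddit) post in other subreddits; you can see my code below. It works well, but I'm curious whether I can improve it in two ways:
First, restrict my code so that it considers each user only once (i.e., it does not collect the same user's posting history twice). Second, require at least 5 posts per user before extracting that user's information (i.e., if a user has written fewer than 5 posts in their whole time on Reddit, my code should skip them).
Thanks a lot!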
import praw
import pandas as data
import datetime as time

# Authenticate with Reddit (credentials redacted)
reddit = praw.Reddit(client_id = '#####',client_secret = '#####',username = '#####',password = '#####',user_agent = '#####')
columns = { "User":[],"Subreddit":[],"Title":[],"Description":[],"Timestamp":[]}
# Walk the newest submissions in the subreddit, then pull each author's own submission history
for submission in reddit.subreddit("ENTER A SUBREDDIT").new(limit=100):
    user = reddit.redditor('{}'.format(submission.author))
    for sub in user.submissions.new(limit=100):
        columns["User"].append(sub.author)
        columns["Subreddit"].append(sub.subreddit)
        columns["Title"].append(sub.title)
        columns["Description"].append(sub.selftext)
        columns["Timestamp"].append(sub.created)
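
For what it's worth, here is a rough sketch of how the two changes I'm asking about might look. It is not tested against the live API. The function name `collect_histories` and the names `seen`, `min_posts` and `history` are made up for illustration; `reddit` is assumed to be the same `praw.Reddit` instance as in the code above. The idea is to remember author names in a set so each user is handled once, and to read a user's submissions into a list first so they can be counted before anything is stored.

```python
def collect_histories(reddit, subreddit_name, min_posts=5, limit=100):
    """Collect each author's submission history once, skipping low-activity users.

    `reddit` is assumed to behave like the praw.Reddit instance in the question.
    """
    columns = {"User": [], "Subreddit": [], "Title": [], "Description": [], "Timestamp": []}
    seen = set()  # author names that have already been processed

    for submission in reddit.subreddit(subreddit_name).new(limit=limit):
        author = submission.author
        if author is None:  # deleted accounts have no author
            continue
        name = str(author)
        if name in seen:  # this user's history was already handled
            continue
        seen.add(name)

        # Read the listing into a list so it can be counted before storing anything.
        history = list(reddit.redditor(name).submissions.new(limit=limit))
        if len(history) < min_posts:  # too few posts, skip this user
            continue

        for sub in history:
            columns["User"].append(name)
            columns["Subreddit"].append(str(sub.subreddit))
            columns["Title"].append(sub.title)
            columns["Description"].append(sub.selftext)
            columns["Timestamp"].append(sub.created)
    return columns
```

It would be called as `columns = collect_histories(reddit, "ENTER A SUBREDDIT")`. Note that the count is taken over at most `limit` recent submissions, which is fine as long as `min_posts` is smaller than `limit`. Suspended or otherwise inaccessible accounts may still raise an error when their history is fetched; this sketch does not handle that case.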