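Problem description
I'm trying to store some data scraped from an API in a DataFrame and then write it to a .csv file. This usually works, but the script sometimes fails with the following error message:
AssertionError: 16 columns passed, passed data had 17 columns
Does anyone know what's going on here? The code is below; it breaks after "pass" is printed.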
from psaw import PushshiftAPI
import datetime as dt
import pandas as pd

api = PushshiftAPI()

# Time window to scrape, as Unix timestamps
start_epoch = int(dt.datetime(2018, 6, 2).timestamp())
end_epoch = int(dt.datetime(2018, 12, 31).timestamp())
subreddit = input('Which subreddit would you like to scrape? ')

# Fields requested for each submission
fields = ['id', 'title', 'subreddit', 'num_comments', 'score', 'author',
          'is_original_content', 'is_self', 'stickied', 'selftext',
          'created_utc', 'locked', 'over_18', 'permalink', 'upvote_ratio', 'url']
submission_results = list(api.search_submissions(after=start_epoch, before=end_epoch,
                                                 subreddit=subreddit, filter=fields,
                                                 limit=None))
print('pass one')
submission_results_df = pd.DataFrame(submission_results)
print('pass two')
# fillna returns a new DataFrame, so keep the result
submission_results_df = submission_results_df.fillna('NULL')
print('pass three')
submission_results_df.to_csv('D:/CAMER/%s_Submissions-%s-%s.csv' % (subreddit, start_epoch, end_epoch))
Solution
No confirmed solution to this problem has been found yet.
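One plausible cause, offered as a sketch rather than a confirmed fix: when pandas builds a DataFrame from a list of namedtuples, it takes the column names from the first record only. If a later record carries one more field than the first, the column count no longer matches and construction fails with a message like the one above (newer pandas versions report it as a ValueError). The example below reproduces this with hypothetical stand-in records (`Short`, `Long`, and `wanted` are illustrative names, not part of psaw) and shows one way around it: convert each record to a dict so pandas aligns on keys.

```python
from collections import namedtuple

import pandas as pd

# Hypothetical stand-ins for API results: records whose field sets differ.
Short = namedtuple("Short", ["id", "title"])
Long = namedtuple("Long", ["id", "title", "upvote_ratio"])
records = [Short("a1", "first"), Long("b2", "second", 0.97)]

# Passing the tuples directly: column names come from the first record only,
# so a later, wider record causes a column-count mismatch.
try:
    pd.DataFrame(records)
    direct_ok = True
except (AssertionError, ValueError) as err:
    direct_ok = False
    print("direct construction failed:", err)

# Converting each record to a dict first lets pandas align on keys;
# fields missing from a record become NaN.
wanted = ["id", "title", "upvote_ratio"]  # assumed list of desired columns
df = pd.DataFrame([r._asdict() for r in records]).reindex(columns=wanted)

# na_rep writes missing values as 'NULL' in the CSV output.
# With no path given, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False, na_rep="NULL")
print(csv_text)
```

In the script from the question, the equivalent change might look like `pd.DataFrame([s.d_ for s in submission_results])`, assuming each psaw result exposes its raw dictionary through a `d_` attribute; that attribute name is an assumption and should be checked against the installed psaw version.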