Problem description
I am trying to use the scholarly package to retrieve Google Scholar search results (documentation: https://scholarly.readthedocs.io/en/latest/ProxyGenerator.html#module-scholarly._proxy_generator; PyPI: https://pypi.org/project/scholarly/). The search returns about 2,000 papers, and I only want the title, journal, and year of each result, saved to a CSV file. Since I am new to Python, I cannot work out how to implement this even after reading the documentation.
from scholarly import scholarly, ProxyGenerator
import pandas as pd
from fp.fp import FreeProxy

# Set up a proxy for scholarly's requests to Google Scholar.
pg = ProxyGenerator()
proxy = FreeProxy(rand=True, timeout=1, country_id=['BR', 'KR', 'US']).get()
pg.SingleProxy(http=proxy, https=proxy)
# This second call reconfigures the same generator to use a local Tor service;
# normally only one of the two proxy setups is needed.
pg.Tor_External(tor_sock_port=9050, tor_control_port=9051, tor_password="scholarly_password")
scholarly.use_proxy(pg)

search_query = scholarly.search_pubs('gait AND "machine learning" AND insole', year_low=2018)

def removekey(d, key):
    # Return a copy of d without key; a missing key is ignored.
    r = dict(d)
    r.pop(key, None)
    return r

def summary(generator):
    # Collect the 'bib' dict of every result, minus the author list.
    entire = [removekey(pub['bib'], 'author') for pub in generator]
    return pd.DataFrame(entire)

summary(search_query)
You would be saving my life and my mental health by helping me..! Thank you.
Solution
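One possible approach, offered as a minimal sketch rather than a confirmed fix: keep only the wanted fields from each result's 'bib' dict and write them out with the standard csv module. The field names 'title', 'venue' and 'pub_year' are an assumption about what scholarly places in 'bib'; print one result first to check them. The helper names bib_rows and write_csv and the sample data are hypothetical, and the sample stands in for a live query so the code runs offline.

```python
import csv
import io
from itertools import islice

# Assumed keys of each result's 'bib' dict; verify against a real result.
FIELDS = ("title", "venue", "pub_year")

def bib_rows(results, fields=FIELDS, limit=None):
    """Yield one dict per result holding only the requested 'bib' fields."""
    # islice stops after `limit` results; limit=None consumes everything.
    for pub in islice(results, limit):
        bib = pub.get("bib", {})
        # Missing keys become empty strings instead of raising KeyError.
        yield {name: bib.get(name, "") for name in fields}

def write_csv(rows, handle, fields=FIELDS):
    """Write rows to an open text handle and return how many were written."""
    writer = csv.DictWriter(handle, fieldnames=fields)
    writer.writeheader()
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count

# Hand-made stand-ins for search results (not real papers).
sample = [
    {"bib": {"title": "Paper A", "author": ["X"], "venue": "Journal A", "pub_year": "2019"}},
    {"bib": {"title": "Paper B", "author": ["Y"]}},
]

buffer = io.StringIO()
written = write_csv(bib_rows(sample), buffer)
print(written)
print(buffer.getvalue())
```

With a live query, the generator returned by scholarly.search_pubs(...) would take the place of sample, and the handle would come from open("results.csv", "w", newline="", encoding="utf-8"). Passing a limit is worth considering, since each page of results is another request to Google Scholar and 2,000 results may trigger blocking.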