Scraping Google Scholar search results leads to error: Cannot fetch from Google Scholar

Problem description

I am trying to retrieve Google Scholar search results with the scholarly package (https://pypi.org/project/scholarly/). The search returns about 2,000 papers, and I only want the title, journal, and year of each result, saved to a CSV file. I am new to Python, and even after reading the documentation (https://scholarly.readthedocs.io/en/latest/ProxyGenerator.html#module-scholarly._proxy_generator) I could not work out how to write the code.

from scholarly import scholarly, ProxyGenerator
import pandas as pd
from fp.fp import FreeProxy

# Route requests through a randomly chosen free proxy.
pg = ProxyGenerator()
proxy = FreeProxy(rand=True, timeout=1, country_id=['BR', 'KR', 'US']).get()
pg.SingleProxy(http=proxy, https=proxy)

# Then point the same generator at a locally running Tor instance.
pg.Tor_External(tor_sock_port=9050, tor_control_port=9051, tor_password="scholarly_password")

scholarly.use_proxy(pg)

search_query = scholarly.search_pubs('gait AND "machine learning" AND insole', year_low=2018)


def removekey(d, key):
    """Return a copy of d without key (no error if the key is absent)."""
    r = dict(d)
    r.pop(key, None)
    return r


def summary(generator):
    """Collect each result's 'bib' dict, minus the authors, into a DataFrame."""
    entire = [removekey(pub['bib'], 'author') for pub in generator]
    return pd.DataFrame(entire)


summary(search_query)

This results in the error: Cannot fetch from Google Scholar
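With roughly 2,000 results, a fetch failure may well occur partway through the iteration rather than at the very start, in which case a list comprehension over the generator discards everything fetched so far. The following is a minimal sketch, not a confirmed fix for the error itself: `collect_results` is a hypothetical helper (not part of scholarly) that consumes any iterable one item at a time and returns whatever was retrieved together with the exception, if any. It is demonstrated here on a fake generator so that it runs without network access.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple


def collect_results(
    results: Iterable[Dict[str, Any]],
    limit: Optional[int] = None,
) -> Tuple[List[Dict[str, Any]], Optional[Exception]]:
    """Consume `results` item by item, keeping what was fetched so far.

    Returns (items, error). `error` is None when iteration finished
    normally or the optional `limit` was reached.
    """
    items: List[Dict[str, Any]] = []
    iterator = iter(results)
    while limit is None or len(items) < limit:
        try:
            item = next(iterator)
        except StopIteration:
            break  # the generator is exhausted
        except Exception as exc:  # e.g. a fetch failure raised mid-iteration
            return items, exc
        items.append(item)
    return items, None


def fake_search():
    """Stand-in for a search generator that fails after two results."""
    yield {"bib": {"title": "Paper A"}}
    yield {"bib": {"title": "Paper B"}}
    raise RuntimeError("Cannot fetch from Google Scholar")


if __name__ == "__main__":
    items, error = collect_results(fake_search())
    print(len(items), error)  # prints: 2 Cannot fetch from Google Scholar
```

Passing a small `limit` first (for example 20) is a cheap way to check whether the proxy setup works at all before requesting the full result set.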

You would save my life and my mental health by helping me! Thanks.

Solution

No working solution to this problem has been found yet.
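Separately from the fetch error, the export step the question asks for (title, journal, and year saved to a CSV file) can be sketched on its own. This is a minimal sketch under stated assumptions: it assumes each result is a dict with a `bib` sub-dict, as the question's code already does, and that the journal and year are stored under the keys `venue` and `pub_year`. Those two key names should be checked against an actual result. The helpers `to_rows` and `write_csv` and the file name `papers.csv` are hypothetical. The standard `csv` module is used so the sketch has no dependencies; `pandas.DataFrame(rows).to_csv(path, index=False)` would serve equally well.

```python
import csv
from typing import Any, Dict, Iterable, List

# Assumed key names inside each result's 'bib' dict.
FIELDS = ("title", "venue", "pub_year")


def to_rows(publications: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only the fields of interest; missing fields become empty strings."""
    rows = []
    for pub in publications:
        bib = pub.get("bib", {})
        rows.append({field: bib.get(field, "") for field in FIELDS})
    return rows


def write_csv(rows: List[Dict[str, Any]], path: str) -> None:
    """Write the rows to `path` with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


if __name__ == "__main__":
    # Hand-written sample data standing in for real search results.
    sample = [
        {"bib": {"title": "Paper A", "venue": "Journal X", "pub_year": "2019",
                 "author": ["A. Author"]}},
        {"bib": {"title": "Paper B"}},  # venue and year missing
    ]
    write_csv(to_rows(sample), "papers.csv")
```

Using `bib.get(field, "")` rather than indexing directly means a result that lacks a venue or year produces an empty cell instead of raising a `KeyError`.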
