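Problem description
My Scrapy application is outputting a long chain of exceptions and I can't see what the problem is. The last one in particular confuses me.
Before I explain why, here is the chain: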
2020-11-04 17:38:58,394:ERROR:Error while obtaining start requests
Traceback (most recent call last):
  File "C:\Users\lguarro\Anaconda3\envs\virtual_workspace\lib\site-packages\urllib3\connectionpool.py", line 670, in urlopen
    httplib_response = self._make_request(
  File "C:\Users\lguarro\Anaconda3\envs\virtual_workspace\lib\site-packages\urllib3\connectionpool.py", line 426, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "C:\Users\lguarro\Anaconda3\envs\virtual_workspace\lib\site-packages\urllib3\connectionpool.py", line 421, in _make_request
    httplib_response = conn.getresponse()
  File "C:\Users\lguarro\Anaconda3\envs\virtual_workspace\lib\http\client.py", line 1347, in getresponse
    response.begin()
  File "C:\Users\lguarro\Anaconda3\envs\virtual_workspace\lib\http\client.py", line 307, in begin
    version, status, reason = self._read_status()
  File "C:\Users\lguarro\Anaconda3\envs\virtual_workspace\lib\http\client.py", line 276, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "C:\Users\lguarro\Anaconda3\envs\virtual_workspace\lib\site-packages\requests\adapters.py", line 439, in send
    resp = conn.urlopen(
  File "C:\Users\lguarro\Anaconda3\envs\virtual_workspace\lib\site-packages\urllib3\connectionpool.py", line 726, in urlopen
    retries = retries.increment(
  File "C:\Users\lguarro\Anaconda3\envs\virtual_workspace\lib\site-packages\urllib3\util\retry.py", line 403, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "C:\Users\lguarro\Anaconda3\envs\virtual_workspace\lib\site-packages\urllib3\packages\six.py", line 734, in reraise
    raise value.with_traceback(tb)
  File "C:\Users\lguarro\Anaconda3\envs\virtual_workspace\lib\site-packages\urllib3\connectionpool.py", in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "C:\Users\lguarro\Anaconda3\envs\virtual_workspace\lib\site-packages\shadow_useragent\core.py", line 35, in _update
    r = requests.get(url=self.URL)
  File "C:\Users\lguarro\Anaconda3\envs\virtual_workspace\lib\site-packages\requests\api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Users\lguarro\Anaconda3\envs\virtual_workspace\lib\site-packages\requests\api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Users\lguarro\Anaconda3\envs\virtual_workspace\lib\site-packages\requests\sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Users\lguarro\Anaconda3\envs\virtual_workspace\lib\site-packages\requests\sessions.py", line 643, in send
    r = adapter.send(request, **kwargs)
  File "C:\Users\lguarro\Anaconda3\envs\virtual_workspace\lib\site-packages\requests\adapters.py", line 498, in send
    raise ConnectionError(err, request=request)
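requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
During handling of the above exception, another exception occurred: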
Traceback (most recent call last):
  File "C:\Users\lguarro\Anaconda3\envs\virtual_workspace\lib\site-packages\scrapy\core\engine.py", line 129, in _next_request
    request = next(slot.start_requests)
  File "C:\Users\lguarro\Anaconda3\envs\virtual_workspace\lib\site-packages\scrapy_splash\middleware.py", line 167, in process_start_requests
    for req in start_requests:
  File "C:\Users\lguarro\Documents\Work\SearchEngine_Pure\SearchEngine_Pure\spiders\SearchEngine.py", line 36, in start_requests
    user_agent = self.ua.random_nomobile
  File "C:\Users\lguarro\Anaconda3\envs\virtual_workspace\lib\site-packages\shadow_useragent\core.py", line 120, in random_nomobile
    return self.pickrandom(exclude_mobile=True)
  File "C:\Users\lguarro\Anaconda3\envs\virtual_workspace\lib\site-packages\shadow_useragent\core.py", line 83, in pickrandom
    self.update()
  File "C:\Users\lguarro\Anaconda3\envs\virtual_workspace\lib\site-packages\shadow_useragent\core.py", line 59, in update
    self._update()
  File "C:\Users\lguarro\Anaconda3\envs\virtual_workspace\lib\site-packages\shadow_useragent\core.py", line 38, in _update
    self.logger.error(r.content.decode('utf-8'))
UnboundLocalError: local variable 'r' referenced before assignment
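Now, the last exception is complaining about:
UnboundLocalError: local variable 'r' referenced before assignment
The only code of mine that appears in that trace is the SearchEngine.py file, which doesn't even have a variable 'r', so I am very confused. Here is the implementation of start_requests, where the error occurs: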
def start_requests(self):
    user_agent = self.ua.random_nomobile  # Exception raised here
    rec = self.mh.FindIdleOneWithNoURLs()
    if rec:
        self.logger.info("Starting url scrape for company, %s using user agent: %s", rec["Company"], user_agent)
        script = self.template.substitute(useragent=user_agent, searchquery=rec["Company"])
        yield SplashRequest(url=self.url, callback=self.parse, endpoint="execute",
            args={
                'lua_source': script
            },
            meta={'RecID': rec["_id"], 'Company': rec["Company"]},
            errback=self.logerror
        )
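It is complaining about the first line of the function, which looks fine to me.
In case it is relevant, I'll add that my script seemed to be running fine just yesterday, but today I had to reset my Docker configuration (where the Splash container is running), and since then I haven't been able to get the script to run smoothly.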
Solution
I found what was causing the problem! There was actually nothing wrong with my own code; it is a bug inside the shadow-useragent library.
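The library periodically makes an API request to fetch a list of the most commonly used user agents. The server behind this API was down, and the author of shadow-useragent did not handle that exception properly: the last frames of the trace show the error handler in core.py reading r.content even though the failed requests.get call never assigned r.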
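The failure pattern can be reproduced without any network access. The sketch below is not the library's actual source; it only mirrors the two lines visible in the traceback (the assignment to r and the later r.content access in the error handler). The names failing_get, update and demo are hypothetical stand-ins.

```python
def failing_get(url):
    """Hypothetical stand-in for requests.get when the server is down."""
    raise ConnectionError("Remote end closed connection without response")


def update(url):
    try:
        r = failing_get(url)  # raises before 'r' is ever bound
        return r
    except Exception:
        # 'r' was never assigned, so touching it here raises
        # UnboundLocalError on top of the original ConnectionError.
        return r.content


def demo():
    """Return the names of the final exception and the one it was raised from."""
    try:
        update("http://example.invalid/")
    except UnboundLocalError as exc:
        return type(exc).__name__, type(exc.__context__).__name__
```

Calling demo() shows that the UnboundLocalError at the end of the chain is only a side effect: the real cause is the connection error stored in its __context__, which is why the interesting part of a chained traceback is usually the first exception, not the last.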
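Luckily, shadow-useragent caches the most recent user-agent list it was able to fetch. So my solution was to edit the shadow-useragent code to bypass the update function entirely and use the cached list (inside the data.pk file), even though it is past its scheduled update. If anyone else runs into this, it is a temporary workaround you can use until that server is back up and running. Hopefully soon!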
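For anyone who would rather not patch the installed package, the same idea (prefer a fresh list, fall back to the cache when the refresh fails) can be sketched on its own. This is only an illustration under assumptions: it treats the cache as a pickled list of user-agent strings, which may not match the real layout of data.pk, and load_cached_user_agents, pick_user_agent and the refresh callable are hypothetical names, not part of shadow-useragent.

```python
import pickle
import random


def load_cached_user_agents(cache_path):
    """Read the list saved by an earlier successful update.

    Assumption: the file holds a pickled list of user-agent strings.
    """
    with open(cache_path, "rb") as fh:
        return pickle.load(fh)


def pick_user_agent(cache_path, refresh=None):
    """Pick a random user agent, falling back to the cache if refresh fails.

    'refresh' is a hypothetical callable returning a fresh list of strings.
    """
    if refresh is not None:
        try:
            return random.choice(refresh())
        except Exception:
            pass  # server unreachable: use whatever was cached last
    return random.choice(load_cached_user_agents(cache_path))
```

Passing refresh=None skips the update step completely, which corresponds to the workaround described above. Only unpickle files you trust, since pickle.load can execute arbitrary code.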