python - Thread blocking with Scrapy on a Windows server

I get an error when I run the following command on a Windows server:

scrapy shell "http://www.yahoo.com"

I have no problem with sites that do not redirect to https.
I think the problem is thread blocking. Could someone please help me?

Here is the error message:

C:\Documents and Settings\mahyar>scrapy shell "http://www.yahoo.com"
2014-03-03 15:49:38-0600 [scrapy] INFO: Scrapy 0.22.2 started (bot: scrapybot)
2014-03-03 15:49:38-0600 [scrapy] INFO: Optional features available: ssl, http11
2014-03-03 15:49:38-0600 [scrapy] INFO: Overridden settings: {'LOGSTATS_INTERVAL': 0}
2014-03-03 15:49:38-0600 [scrapy] INFO: Enabled extensions: TelnetConsole, CloseSpider, WebService, CoreStats, SpiderState
2014-03-03 15:49:38-0600 [scrapy] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2014-03-03 15:49:38-0600 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2014-03-03 15:49:38-0600 [scrapy] INFO: Enabled item pipelines:
2014-03-03 15:49:38-0600 [scrapy] DEBUG: Telnet console listening on 0.0.0.0:6023
2014-03-03 15:49:38-0600 [scrapy] DEBUG: Web service listening on 0.0.0.0:6080
2014-03-03 15:49:38-0600 [default] INFO: Spider opened
2014-03-03 15:49:38-0600 [default] DEBUG: Redirecting (301) to <GET https://www.yahoo.com/> from <GET http://www.yahoo.com>
Traceback (most recent call last):
  File "c:\Python27\lib\runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "c:\Python27\lib\runpy.py", line 72, in _run_code
    exec code in run_globals
  File "c:\Python27\lib\site-packages\scrapy\cmdline.py", line 168, in <module>
    execute()
  File "c:\Python27\lib\site-packages\scrapy\cmdline.py", line 143, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "c:\Python27\lib\site-packages\scrapy\cmdline.py", line 89, in _run_print_help
    func(*a, **kw)
  File "c:\Python27\lib\site-packages\scrapy\cmdline.py", line 150, in _run_command
    cmd.run(args, opts)
  File "c:\Python27\lib\site-packages\scrapy\commands\shell.py", line 50, in run
    shell.start(url=url, spider=spider)
  File "c:\Python27\lib\site-packages\scrapy\shell.py", line 45, in start
    self.fetch(url, spider)
  File "c:\Python27\lib\site-packages\scrapy\shell.py", line 90, in fetch
    reactor, self._schedule, request, spider)
  File "c:\Python27\lib\site-packages\twisted\internet\threads.py", line 122, in blockingCallFromThread
    result.raiseException()
  File "<string>", line 2, in raiseException
OverflowError: integer 2147486719 does not fit '32-bit int'

Solution

It looks like you are running a 32-bit version of Windows, while Scrapy expects a 64-bit operating system.
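
The arithmetic behind the OverflowError can be checked directly. The sketch below uses only the standard library to show that the value from the traceback is just above the largest signed 32-bit integer. The helper name fits and the variable names are illustrative and do not come from Scrapy or Twisted.

```python
import struct

# Value reported in the OverflowError of the traceback.
value = 2147486719

# Largest value a signed 32-bit integer can hold.
INT32_MAX = 2**31 - 1

exceeds_int32 = value > INT32_MAX
overflow_by = value - INT32_MAX


def fits(fmt, number):
    """Return True if `number` can be packed with struct format `fmt`."""
    try:
        struct.pack(fmt, number)
        return True
    except struct.error:
        return False


fits_signed_32 = fits("<i", value)    # signed 32-bit
fits_unsigned_32 = fits("<I", value)  # unsigned 32-bit
fits_signed_64 = fits("<q", value)    # signed 64-bit

print(hex(value), exceeds_int32, overflow_by)
print(fits_signed_32, fits_unsigned_32, fits_signed_64)
```

The value fits in an unsigned 32-bit or a signed 64-bit integer but not in a signed 32-bit one, which matches the error text. The log shows the failure only after the redirect to https, so the overflow is probably raised somewhere in the SSL code path. The traceback alone does not establish which component passes the value.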
