Problem description
I ran this code on my own computer and it works fine, but something goes wrong after deploying it to Zyte (Scrapinghub).
I use scrapy_crawl_once to prevent duplicate crawls, and it works correctly on my machine, but after uploading the project to Zyte it no longer detects duplicates.
All relevant files are listed below.
Zyte

```
[scrapy_crawl_once.middlewares] Opened crawl database '/scrapinghub/.scrapy/crawl_once/gumtree.sqlite' with 0 existing records
```

My computer

```
INFO: Opened crawl database 'E:\\python\\my projects\\GT\\final\\GT\\New GT\\.scrapy\\crawl_once\\gumtree.sqlite' with 20 existing records
```
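The difference between the two logs is the record count: locally the database already holds 20 fingerprints, while on Zyte it opens with 0 existing records, which suggests each Zyte job starts from a fresh filesystem and the `.scrapy` directory is not carried over between runs by default. A minimal stdlib-only sketch (using a hypothetical temporary path in place of the Zyte one) of what a freshly created database looks like when first opened:

```python
import os
import sqlite3
import tempfile

# Hypothetical path for illustration; on Zyte the job logged
# /scrapinghub/.scrapy/crawl_once/gumtree.sqlite instead.
db_path = os.path.join(tempfile.mkdtemp(), "gumtree.sqlite")

# Connecting to a path that does not exist yet creates an empty database.
con = sqlite3.connect(db_path)

# An empty database has no tables and therefore no stored fingerprints,
# matching the "0 existing records" line in the Zyte log.
tables = con.execute(
    "SELECT name FROM sqlite_master WHERE type='table'"
).fetchall()
print(len(tables))  # 0
```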
setup.py

```
# Automatically created by: shub deploy
from setuptools import setup, find_packages

setup(
    name='project',
    version='1.0',
    packages=find_packages(),
    entry_points={'scrapy': ['settings = gumtree.settings']},
)
```
Directory

```
New/
    .scrapy/crawl_once/gumtree.sqlite
    gumtree/
        __init__.py
        items.py
        middlewares.py
        models.py
        pipelines.py
        settings.py
        spiders/
            __init__.py
            example.py
        templates/
            base.html
            results.html
            __init__.py
    requirements.txt
    scrapinghub.yml
```
settings.py

```
SPIDER_MIDDLEWARES = {
    'scrapy_crawl_once.CrawlOnceMiddleware': 100,
}

# Enable or disable downloader middlewares
# See https://docs.scrapy.org/en/latest/topics/downloader-middleware.html
DOWNLOADER_MIDDLEWARES = {
    'scrapy_crawl_once.CrawlOnceMiddleware': 50,
}
```
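For context, the deduplication idea behind the middleware can be sketched with the standard library alone: store a fingerprint of every processed request in SQLite and skip requests whose fingerprint is already there. This is an illustrative simplification, not scrapy-crawl-once's actual code (the real middleware only tracks requests with `meta['crawl_once'] = True` unless `CRAWL_ONCE_DEFAULT` is enabled, and persists to the `.scrapy/crawl_once/` directory):

```python
import hashlib
import sqlite3

# In-memory database for the sketch; the real middleware uses a file
# under .scrapy/crawl_once/, which must survive between runs for
# deduplication to work.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE IF NOT EXISTS seen (fp TEXT PRIMARY KEY)")

def crawl_once(url: str) -> bool:
    """Return True if the URL should be crawled, False if seen before."""
    fp = hashlib.sha1(url.encode()).hexdigest()
    if con.execute("SELECT 1 FROM seen WHERE fp = ?", (fp,)).fetchone():
        return False
    con.execute("INSERT INTO seen (fp) VALUES (?)", (fp,))
    con.commit()
    return True

print(crawl_once("https://example.com/ad/1"))  # True  (first time seen)
print(crawl_once("https://example.com/ad/1"))  # False (duplicate)
```

If the database file starts empty on every run, as the Zyte log shows, every request looks new, which would explain why duplicates are not detected there.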
scrapinghub.yml

```
project: 111111
requirements:
    file: requirements.txt
```
requirements.txt

```
SQLAlchemy==1.4.20
PyMySQL==1.0.2
scrapy-crawl-once==0.1.1
itemadapter==0.2.0
```