Problem description
I am submitting a form request to a website. The request is sent successfully, but no data is returned.
My code:
str.strip()
Logs:
2020-09-05 22:37:57 [scrapy.core.engine] DEBUG: Crawled (200) <POST https://safer.fmcsa.dot.gov/query.asp> (referer: https://safer.fmcsa.dot.gov/)
2020-09-05 22:37:57 [scrapy.core.engine] DEBUG: Crawled (200) <POST https://safer.fmcsa.dot.gov/query.asp> (referer: https://safer.fmcsa.dot.gov/)
2020-09-05 22:37:59 [scrapy.core.engine] DEBUG: Crawled (200) <POST https://safer.fmcsa.dot.gov/query.asp> (referer: https://safer.fmcsa.dot.gov/)
2020-09-05 22:37:59 [scrapy.core.engine] INFO: Closing spider (finished)
2020-09-05 22:37:59 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
Here are some sample codes that I use as input for the POST request:
2146709
273286
120670
2036998
690147
Solution
I believe all you need to do is remove tbody from the XPath here:
cargo = response.xpath('(//table[@summary="Cargo Carried"]/tbody/tr)[2]')
Use this instead:
cargo = response.xpath('//table[@summary="Cargo Carried"]/tr[2]')
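# I also removed the parentheses from the path because you don't need them, but they did not cause the problem.
The reason is that Scrapy parses the raw source of the page, whereas your browser may add a tbody element when it renders the table, even if tbody is not in the source. Further information here.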
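The effect can be reproduced without Scrapy. The following is only a minimal sketch: it uses the standard library's xml.etree.ElementTree as a stand-in for Scrapy's selector, and the page source in RAW_HTML is a made-up, well-formed snippet (an assumption, not the real SAFER page). It shows that a path containing tbody matches nothing when the raw source has no tbody, while the path without it finds the second row.

```python
import xml.etree.ElementTree as ET

# Hypothetical raw page source: the table has no <tbody> element,
# which is what a parser sees before any browser rendering.
RAW_HTML = """
<html><body>
<table summary="Cargo Carried">
  <tr><td>header row</td></tr>
  <tr><td>General Freight</td></tr>
</table>
</body></html>
""".strip()

root = ET.fromstring(RAW_HTML)

# Path as copied from browser dev tools, including the tbody the browser added.
with_tbody = root.findall(".//table[@summary='Cargo Carried']/tbody/tr")

# Path that matches the raw source: second tr directly under the table.
without_tbody = root.findall(".//table[@summary='Cargo Carried']/tr[2]")

print(len(with_tbody))                    # no rows matched
print(without_tbody[0].find("td").text)   # text of the second row's cell
```

In a spider, the same check can be done by inspecting response.text (or "view page source" in the browser, rather than the element inspector) to see whether tbody is really present before putting it in the XPath.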