Problem description
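I built an asynchronous program that checks whether a certain element exists on multiple paths of a website. The program has a base URL and reads the different paths to check on that domain from a JSON file (name.json). If the element I am looking for exists, the program should print "1". However, I quickly realized that it only checks the last item in the JSON list.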
import json
import grequests
from bs4 import BeautifulSoup

idlist = json.loads(open('name.json').read())
baseurl = 'https://steamcommunity.com/id/'
for uid in idlist:
    fullurl = baseurl + uid
rs = (grequests.get(fullurl) for uid in idlist)
resp = grequests.map(rs)
for r in resp:
    soup = BeautifulSoup(r.text,'lxml')
    if soup.find('span',class_='actual_persona_name'):
        print('1')
    else:
        print('2')
name.json:
["xyz","sdasda9229","sdasda923229","sda","sd2","aaaaaa","aaaaaaaaa","aa2092425","aaaa23917"]
Solution
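The full URL is not stored after the ID is appended to the base URL. Each loop iteration overwrites fullurl, so once the loop finishes it holds only the URL for the last ID, and the generator then requests that same URL once per entry. You have to store each full URL and pass it in when building the GET request.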
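To see the overwriting without touching the network, here is a small sketch that uses plain strings in place of requests. The three IDs are a shortened sample of name.json, and the variable names repeated and stored are only illustrative.

```python
base_url = "https://steamcommunity.com/id/"
ids = ["xyz", "sda", "sd2"]  # shortened sample of name.json

# Original pattern: every iteration overwrites full_url.
for uid in ids:
    full_url = base_url + uid

# The comprehension's own uid is never used, so full_url stays at its last value.
repeated = [full_url for uid in ids]

# Building the URL from uid inside the comprehension keeps every URL.
stored = [base_url + uid for uid in ids]

print(repeated)
print(stored)
```

The first list contains the sd2 URL three times, which is why the original program appeared to check only the last entry. The second list contains one URL per ID.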
complete_urls = []
for uid in idlist:
    fullurl = baseurl + uid
    complete_urls.append(fullurl)
rs = (grequests.get(fullurl) for fullurl in complete_urls)
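For completeness, here is one way the whole flow might look with the stored URLs. This is a sketch rather than the answer's exact code: build_urls and check_profiles are hypothetical helper names, and counting a failed request as "2" is an assumption the question does not settle. It relies on grequests.map putting None in the result list for any request that failed, so the response is checked before reading its text.

```python
from typing import Iterable, List

BASE_URL = "https://steamcommunity.com/id/"  # base URL from the question


def build_urls(ids: Iterable[str], base_url: str = BASE_URL) -> List[str]:
    """Return one full URL per ID, keeping every URL rather than only the last."""
    return [base_url + uid for uid in ids]


def check_profiles(ids: Iterable[str]) -> List[str]:
    """Return '1' for each page that contains the element, '2' otherwise."""
    # Third-party imports are local so build_urls() can be used without them.
    import grequests
    from bs4 import BeautifulSoup

    pending = (grequests.get(url) for url in build_urls(ids))
    results = []
    for response in grequests.map(pending):
        # Assumption: a failed request (None) counts as "element not found".
        if response is None:
            results.append("2")
            continue
        soup = BeautifulSoup(response.text, "lxml")
        found = soup.find("span", class_="actual_persona_name")
        results.append("1" if found else "2")
    return results
```

Loading the ID list from name.json with json.load and passing it to check_profiles would then give one result per ID, in the same order as the file.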