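Python error "return self.attrs[key]" (KeyError: 'title') with BeautifulSoup

Problem description

I have started learning some more complex things in Python, and today I decided to try BeautifulSoup. The problem appears when I try to get the title of a product. I tried changing ".find" to ".findAll", but that did not solve it. Could someone please help me? This is my code: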

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as Soup
ListaSteam = "https://store.steampowered.com/search/?sort_by=Price_ASC&category1=998%2C996&category2=29"

# PAGE - FETCH - CLOSE
Pagina = uReq(ListaSteam)
PaginaHtml = Pagina.read()
Pagina.close()

# STEP 1
PaginaSoup = Soup(PaginaHtml,"html.parser")
CodigoJuegos = PaginaSoup.find("div",{"id":"search_resultsRows"})
PRUEBA = CodigoJuegos.a.span["title"]
print(PRUEBA)

This is the error:

Traceback (most recent call last):
  File "C:\Users\Usuario\Desktop\******", line 14, in <module>
    PRUEBA = CodigoJuegos.a.span["title"]
  File "C:\Users\Usuario\AppData\Local\Programs\Python\python39\lib\site-packages\bs4\element.py", line 1406, in __getitem__
    return self.attrs[key]
KeyError: 'title'

Solution

First, you should follow PEP 8 styling. As written, your code is hard to read.

If you want to fix it with the fewest code changes, do the following:

PRUEBA = CodigoJuegos.a.span.text
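
The reason the original line fails is that indexing a tag with square brackets looks up an HTML attribute, and the span apparently has no "title" attribute (the other answers select it by class "title"). A minimal sketch of the difference, assuming a simplified, hypothetical HTML snippet in place of the real Steam page:

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for one row of the search results.
# The real page is more complex; only the span's class is assumed here.
html = """
<div id="search_resultsRows">
  <a href="https://example.com/app/1">
    <span class="title">Barro F</span>
  </a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
span = soup.find("div", {"id": "search_resultsRows"}).a.span

# tag["name"] reads an attribute and raises KeyError when it is missing.
try:
    span["title"]
except KeyError as exc:
    print("KeyError:", exc)

print(span.attrs)         # the attributes that actually exist on the tag
print(span.get("title"))  # .get() returns None instead of raising
print(span.text)          # the text between the opening and closing tags
```

Using .get() is a reasonable choice when an attribute may or may not be present, while .text (or get_text()) is what you want for the visible label.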

That said, I scrape websites professionally (using bs4 among other tools), and I would do it like this:

import requests
from bs4 import BeautifulSoup

search_url = "https://store.steampowered.com/search"
category1 = ('998', '996')
category2 = '29'

params = {
    'sort_by': 'Price_ASC',
    'category1': ','.join(category1),
    'category2': category2,
}

response = requests.get(search_url, params=params)

soup = BeautifulSoup(response.text, "html.parser")
elms = soup.find_all("span", {"class": "title"})

for elm in elms:
    print(elm.text)

Output:

Barro F
The Last Hope: Trump vs Mafia - North Korea
Ninja Stealth
Tetropunk
Oracle
Legend of Himari
Planes, Bullets and Vodka
Shift
Blast-off
...

If you already depend on bs4, you might as well install requests too.
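
As a rough illustration of what the params dictionary buys you, the sketch below prepares the request without sending it and prints the URL that requests would build. It only assumes the same URL and parameters as above; no network access is needed.

```python
import requests

search_url = "https://store.steampowered.com/search"
params = {
    "sort_by": "Price_ASC",
    "category1": ",".join(("998", "996")),
    "category2": "29",
}

# Build the request object and prepare it, but do not send it.
# This lets us inspect how the query string is encoded.
prepared = requests.Request("GET", search_url, params=params).prepare()
print(prepared.url)
```

The comma in category1 should come out percent-encoded as %2C, which matches the hand-written URL in the question, so there is no need to encode the query string yourself.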


Maybe you want the text of the span rather than a "title" attribute:

PRUEBA = CodigoJuegos.a.span.get_text()

Use the CSS selector 'span.title':

CodigoJuegos = PaginaSoup.select('span.title')
for t in CodigoJuegos:
    print(t.text)
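
For comparison, here is a small sketch showing that select('span.title') and find_all("span", {"class": "title"}) pick out the same elements, and that a selector can also be scoped to the results container. The HTML is a hypothetical, simplified stand-in for the real page, and the "other" spans are invented purely to show what gets filtered out.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two result rows, each with one title span
# and one unrelated span that should not be selected.
html = """
<div id="search_resultsRows">
  <a href="#"><span class="title">Barro F</span><span class="other">x</span></a>
  <a href="#"><span class="title">Ninja Stealth</span><span class="other">y</span></a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# CSS selector: every <span> whose class list contains "title".
by_selector = [t.text for t in soup.select("span.title")]

# The same query expressed with find_all and an attribute filter.
by_find_all = [t.text for t in soup.find_all("span", {"class": "title"})]

# Scoped selector: only title spans inside the results container.
scoped = [t.text for t in soup.select("#search_resultsRows a span.title")]

print(by_selector)
print(by_find_all)
print(scoped)
```

Scoping the selector to the container can help if the page has other span.title elements outside the result list, though whether that is the case on the real page is not something this sketch checks.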