I am trying to display only the text inside a tag, for example:
<span class="listing-row__price ">$71,996</span>
I only want to display
“$71,996”
My code is:
import requests
from bs4 import BeautifulSoup
from csv import writer

response = requests.get('https://www.cars.com/for-sale/searchresults.action/?mdId=21811&mkId=20024&page=1&perPage=100&rd=99999&searchSource=PAGINATION&showMore=false&sort=relevance&stkTypId=28880&zc=11209')
soup = BeautifulSoup(response.text, 'html.parser')
cars = soup.find_all('span', attrs={'class': 'listing-row__price'})
print(cars)
Where did I go wrong?
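Solution
Your find_all() call already returns the right tags; print(cars) just prints the tag objects rather than their text. There are a few ways to get the text inside a tag.
a) Use the .text attribute of the tag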
cars = soup.find_all('span', attrs={'class': 'listing-row__price'})
for tag in cars:
    print(tag.text.strip())
Output:
$71,996
$75,831
$71,412
$75,476
....
b) Use get_text()
for tag in cars:
    print(tag.get_text().strip())
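Approaches a) and b) can be tried offline on a small inline snippet rather than the live page. The following is only a sketch: the `html` string is made-up markup modeled on the span in the question, and the names `prices_text` and `prices_get_text` are illustrative. It also shows `get_text(strip=True)`, which trims the surrounding whitespace so no separate `.strip()` call is needed.

```python
from bs4 import BeautifulSoup

# Hypothetical inline markup modeled on the span in the question
html = """
<span class="listing-row__price ">$71,996</span>
<span class="listing-row__price ">
    $75,831
</span>
"""

soup = BeautifulSoup(html, 'html.parser')
cars = soup.find_all('span', attrs={'class': 'listing-row__price'})

# a) .text followed by an explicit strip()
prices_text = [tag.text.strip() for tag in cars]

# b) get_text(strip=True) trims the whitespace itself
prices_get_text = [tag.get_text(strip=True) for tag in cars]

print(prices_text)
print(prices_get_text)
```

Both lists should hold the same bare price strings, since each span contains nothing but its price.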
c) If the tag contains only that one string, you can also use any of these options:
> .string
> .contents[0]
> next(tag.children)
> next(tag.strings)
> next(tag.stripped_strings)
For example:
for tag in cars:
    print(tag.string.strip())
    # or uncomment any of the lines below
    # print(tag.contents[0].strip())
    # print(next(tag.children).strip())
    # print(next(tag.strings).strip())
    # print(next(tag.stripped_strings))
Output:
$71,476
$77,001
...
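To see that the options in c) are interchangeable when a tag holds a single string, here is a small sketch that runs all five on one tag. The inline `html` string and the `results` list are assumptions made for illustration, not part of the original page.

```python
from bs4 import BeautifulSoup

# Hypothetical single-string tag, copied from the example in the question
html = '<span class="listing-row__price ">$71,996</span>'
tag = BeautifulSoup(html, 'html.parser').find('span')

# Each expression reaches the tag's only child, a text node
results = [
    tag.string.strip(),
    tag.contents[0].strip(),
    next(tag.children).strip(),
    next(tag.strings).strip(),
    next(tag.stripped_strings),  # already stripped, no .strip() needed
]

print(results)
```

The `next(...)` forms raise StopIteration on an empty tag and `.contents[0]` raises IndexError, so these shortcuts are best kept for markup you know contains text.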
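Note:
.text and .string are not the same. If the tag has more than one child (for example, text plus another element), .string returns None, while .text returns all of the text inside the tag, including the text of nested elements.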
from bs4 import BeautifulSoup

html = """
<p>hello <b>there</b></p>
"""
soup = BeautifulSoup(html, 'html.parser')
p = soup.find('p')
print(p.string)
print(p.text)
Output:
None
hello there
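Because .string can be None, calling .strip() on it directly would raise AttributeError for tags with nested elements. One possible way to guard against that, sketched here with illustrative variable names (`direct`, `safe`, `parts`), is to fall back to get_text() or to iterate over stripped_strings:

```python
from bs4 import BeautifulSoup

html = "<p>hello <b>there</b></p>"
p = BeautifulSoup(html, 'html.parser').find('p')

# .string is None here because <p> has two children: a text node and <b>
direct = p.string

# Fall back to get_text() when .string is None
safe = p.string if p.string is not None else p.get_text()

# stripped_strings yields every text fragment with whitespace trimmed
parts = list(p.stripped_strings)

print(direct)
print(safe)
print(parts)
```

Which fallback fits depends on whether you want one joined string or the individual fragments.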