Beautiful Soup find_all in a for loop only returns the first element

Problem description

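I am trying to find the total number of pages of Airbnb listings by getting the last page number and using it to build a range of pages to loop over.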

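When I use find_all(loc, {class: id}) and then try to get all the page numbers in that section, only the first row (the first page) is returned. The image below shows the rows whose text I want to get so that I can find the largest number (10 in this case).
[Image: the rows I want to access]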

When I find all the div content for that class, it gives only the first page-number row, plus the a tag with aria-label=Next.

I have been playing around with multiple variations of the code below, but it always returns only the first row of page numbers (2):

import requests
from bs4 import BeautifulSoup

Make editable scraping parameters

#checkin and checkout dates
checkin_checkout = 'checkin=2021-05-28&checkout=2021-05-30'
#number of adults for the listing to support
adults = 12
#total beds for the listing
n_beds = adults//2

Get the URL

# url I am using    
nearby = '''https://www.airbnb.com/s/homes?tab_id=
        home_tab&refinement_paths%5B%5D=%2Fhomes&flexible_trip_dates%5B%5D=july&flexible_trip_dates%5B%5D=june&flexible_trip_lengths
        %5B%5D=weekend_trip&date_picker_type=calendar&
        location_search=NEARBY&
        {}&
        adults={}&
        source=structured_search_input_header&search_type=filter_change&room_types%5B%5D=
        Entire%20home%2Fapt&place_id=ChIJu-A79dZz44kRGu2B8kV8ylQ&
        min_beds={}'''.format(checkin_checkout,adults,n_beds)
        
res = requests.get(nearby)
print(res.status_code)
# parse the response so it can be searched below
soup = BeautifulSoup(res.text, 'html.parser')
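
The triple-quoted string above keeps its line breaks and indentation inside the URL. A minimal sketch of one alternative is to build the query string with urllib.parse.urlencode. The helper name build_search_url is hypothetical, and only a subset of the original query parameters is included here for brevity.

```python
from urllib.parse import quote, urlencode


def build_search_url(checkin, checkout, adults, min_beds):
    """Build a search URL without stray whitespace (illustrative helper)."""
    # Subset of the parameters used in the question; values are percent-encoded
    # by urlencode, so they are written here in their plain form.
    params = [
        ("tab_id", "home_tab"),
        ("refinement_paths[]", "/homes"),
        ("location_search", "NEARBY"),
        ("checkin", checkin),
        ("checkout", checkout),
        ("adults", adults),
        ("room_types[]", "Entire home/apt"),
        ("place_id", "ChIJu-A79dZz44kRGu2B8kV8ylQ"),
        ("min_beds", min_beds),
    ]
    # quote_via=quote encodes spaces as %20, matching the original URL.
    return "https://www.airbnb.com/s/homes?" + urlencode(params, quote_via=quote)


url = build_search_url("2021-05-28", "2021-05-30", adults=12, min_beds=6)
print(url)
```

The resulting string can be passed to requests.get in place of nearby.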

The part that does not return what I want

#trying to access the html that holds the page numbers range
# shows up like this as buttons on the bottom of the page (1,2,3,4,5 ... 10)
div = soup.find_all('div',{'class': '_jro6t0'}) 
for row in div:
    print(row.find_all('a',{'class': '_1y623pm'}))

I tried this code, but it still prints only the first page-number row, with class ID _1y623pm and text 2.
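
find_all can only return elements that exist in the HTML that was parsed. The following sketch, using a hypothetical HTML fragment that reuses the class names from the question, shows that the nested loop does collect every matching link when the links are present. If the same function returns a single item for res.text, the downloaded HTML probably contains only that one link; one possible cause, which this post does not confirm, is that the remaining buttons are added in the browser by JavaScript.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that reuses the class names from the question.
SAMPLE_HTML = """
<div class="_jro6t0">
  <a class="_1y623pm" href="?page=2">2</a>
  <a class="_1y623pm" href="?page=3">3</a>
  <a class="_1y623pm" href="?page=10">10</a>
  <a aria-label="Next" href="?page=2">Next</a>
</div>
"""


def page_link_texts(html):
    """Return the text of every page link found inside the pagination divs."""
    soup = BeautifulSoup(html, "html.parser")
    texts = []
    for div in soup.find_all("div", {"class": "_jro6t0"}):
        for a in div.find_all("a", {"class": "_1y623pm"}):
            texts.append(a.get_text(strip=True))
    return texts


print(page_link_texts(SAMPLE_HTML))
```

Calling page_link_texts(res.text) on the real response gives a quick check of how many page links the server actually sent.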

Solution

# The original approach loops over each matching div:
# div = soup.find_all('div',{'class': '_jro6t0'})
# for row in div:
#     This searches inside one div at a time; I think it finds only one
#     link in each div, as in the image.
#     row.find_all('a',{'class': '_1y623pm'})

# First find all div tags with the class name:
divs = soup.find_all('div', {'class': '_jro6t0'})
print(divs)
# Then, inside each div that was found, find all a tags with the class name.
# find_all() returns a ResultSet (a list of tags), which has no find_all()
# of its own, so iterate over the divs instead of calling divs.find_all().
for div in divs:
    for row in div.find_all('a', {'class': '_1y623pm'}):
        print(row.text)
        print(row.attrs)
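
Since the goal in the question is the largest page number, a short sketch of that last step may help. It assumes the page links are present in the parsed HTML and uses a CSS selector through select() as an alternative to nested find_all calls. The function name last_page_number and the sample markup are hypothetical.

```python
from bs4 import BeautifulSoup


def last_page_number(html):
    """Return the largest numeric page link, or None if no link is found."""
    soup = BeautifulSoup(html, "html.parser")
    numbers = []
    # "div._jro6t0 a._1y623pm" matches a tags with class _1y623pm
    # anywhere inside a div with class _jro6t0.
    for a in soup.select("div._jro6t0 a._1y623pm"):
        text = a.get_text(strip=True)
        if text.isdigit():  # skip non-numeric links such as "Next"
            numbers.append(int(text))
    return max(numbers, default=None)


# Hypothetical markup that reuses the class names from the question.
sample = (
    '<div class="_jro6t0">'
    '<a class="_1y623pm">2</a><a class="_1y623pm">3</a>'
    '<a class="_1y623pm">10</a><a aria-label="Next">Next</a>'
    "</div>"
)
print(last_page_number(sample))
```

With a real response, last_page_number(res.text) could feed range(1, last + 1) to build the list of pages to loop over, provided it does not return None.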

@BuddyBob: I added comments to the post. Do you need more, or is that enough?