Scraping text review data from a company's Google Maps reviews

Problem description

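I want to scrape the text of a company's Google Maps reviews so that I can run sentiment analysis on it. However, my code does not run and I get an error. Could you guide me toward a fix? Thanks!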

!pip install selenium
!apt-get update 
!apt install chromium-chromedriver

from selenium import webdriver

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
# Start a single headless Chrome session (chromedriver must be on PATH)
driver = webdriver.Chrome(options=chrome_options)

from bs4 import BeautifulSoup
import time
import io
import pandas as pd
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

# add the Google Maps link whose data you want to scrape
driver.get('https://www.google.com/maps/place/Embassy+of+Bangladesh/@38.9418017,-77.0679642,15z/data=!4m7!3m6!1s0x0:0x5621455e7625f36e!8m2!3d38.9418017!4d-77.0679642!9m1!1b1')

# wait up to 10 seconds for the Sort button, then open the sort menu
wait = WebDriverWait(driver, 10)
menu_bt = wait.until(EC.element_to_be_clickable(
    (By.XPATH, '//button[@data-value=\'Sort\']')))
menu_bt.click()
recent_rating_bt = driver.find_elements_by_xpath(
    '//div[@role=\'menuitem\']')[50]
recent_rating_bt.click()
time.sleep(5)

Error message:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-35-94b4c6e89470> in <module>()
      5 menu_bt.click()
      6 recent_rating_bt = driver.find_elements_by_xpath(
----> 7                                      '//div[@role=\'menuitem\']')[50]
      8 recent_rating_bt.click()
      9 time.sleep(5)

IndexError: list index out of range

Solution

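You are accessing the item at index 50 of the list returned by find_elements_by_xpath(). The error message says that this index does not exist, which means the returned list has fewer than 51 elements.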
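
To see how short the list actually is, it can help to print what the XPath matched before indexing into it. The sketch below is one way to do that. `describe_items` is a hypothetical helper name, and it assumes each element exposes a `.text` attribute, as Selenium's WebElement does. The stand-in list (with made-up labels) replaces the result of find_elements_by_xpath() so the snippet runs without a browser.

```python
from types import SimpleNamespace


def describe_items(items):
    """Return one 'index: text' line per matched element."""
    return [f"{i}: {item.text!r}" for i, item in enumerate(items)]


# Stand-in for the list returned by driver.find_elements_by_xpath(...).
# The labels are invented for illustration, not taken from the real page.
items = [SimpleNamespace(text=label) for label in ("Option A", "Option B", "Option C")]

print(f"{len(items)} elements matched")
for line in describe_items(items):
    print(line)
```

With three matched elements, only indexes 0 through 2 are valid, so asking for [50] raises IndexError.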

Check the length of the returned list before indexing into it.
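
A minimal sketch of such a length check follows. The names `item_at` and `click_menu_item` are hypothetical helpers, not part of Selenium. The sketch assumes `driver` is the WebDriver from the question and uses the generic `find_elements(by, value)` call, where the string "xpath" is the value of `By.XPATH`. Recent Selenium 4 releases no longer provide find_elements_by_xpath(), so this form may be needed anyway.

```python
def item_at(items, index):
    """Return items[index], or None when index is outside 0..len(items)-1."""
    if 0 <= index < len(items):
        return items[index]
    return None


def click_menu_item(driver, index, xpath="//div[@role='menuitem']"):
    """Click the menu item at `index` if it exists; return True on success."""
    # "xpath" is the string behind selenium's By.XPATH constant
    items = driver.find_elements("xpath", xpath)
    target = item_at(items, index)
    if target is None:
        # Report the real size instead of failing with IndexError
        print(f"Only {len(items)} menu items found; index {index} does not exist.")
        return False
    target.click()
    return True
```

In the question's code this would be called as `click_menu_item(driver, 50)` after `menu_bt.click()`. It returns False and prints the actual number of menu items, which shows which indexes are valid for the sort option you want.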