Python Selenium: getting the page title

Problem description

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.firefox.options import Options

options = Options()
options.headless = True
driver = webdriver.Firefox(options=options)
driver.get("https://hapondo.qa/rent/doha/apartments/studio")
element = WebDriverWait(driver,10).until(
    EC.presence_of_element_located((By.XPATH,"/html/head/title"))
)

print(element.text)

Can't get the page title with the headless option? I tried waiting, and even tried driver.title.

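Solutions

You need to keep the following in mind:

  • To retrieve the page title, use driver.title rather than locating the <title> element by XPath.
  • The hapondo website contains JavaScript-enabled elements.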

Solution

To extract the page title, induce WebDriverWait for title_contains() and then read driver.title:

  • Code block:

    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    options = webdriver.ChromeOptions()
    options.add_argument('--headless')
    options.add_argument('--window-size=1920,1080')
    # Note: Selenium 4 deprecates executable_path in favor of a Service object.
    driver = webdriver.Chrome(options=options,executable_path=r'C:\WebDrivers\chromedriver.exe')
    driver.get('https://hapondo.qa/rent/doha/apartments/studio')
    # Block for up to 10 seconds until the title contains "hapondo".
    WebDriverWait(driver,10).until(EC.title_contains("hapondo"))
    print(driver.title)
    
  • Console output:

    Studio Apartments for rent in Doha | hapondo
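
To make the wait above less of a black box: WebDriverWait(...).until(EC.title_contains(...)) amounts, roughly, to re-reading driver.title at a fixed interval until it contains the given text or the timeout expires. The following is only a standard-library sketch of that idea, not Selenium's actual implementation. The names wait_for_title and FakeDriver are hypothetical and exist only for illustration; FakeDriver imitates a page whose title is filled in by JavaScript after a couple of reads.

```python
import time


def wait_for_title(driver, fragment, timeout=10.0, poll=0.5):
    """Poll driver.title until it contains `fragment`, then return the title."""
    deadline = time.monotonic() + timeout
    while True:
        title = driver.title
        if fragment in title:
            return title
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"title never contained {fragment!r}; last title was {title!r}"
            )
        time.sleep(poll)


class FakeDriver:
    """Hypothetical stand-in for a WebDriver whose title appears late."""

    def __init__(self, final_title, empty_reads=2):
        self._final_title = final_title
        self._empty_reads = empty_reads

    @property
    def title(self):
        # Report an empty title for the first few reads, as if scripts
        # on the page had not set it yet.
        if self._empty_reads > 0:
            self._empty_reads -= 1
            return ""
        return self._final_title


if __name__ == "__main__":
    driver = FakeDriver("Studio Apartments for rent in Doha | hapondo")
    print(wait_for_title(driver, "hapondo", timeout=2.0, poll=0.01))
```

With a real browser session you would keep using WebDriverWait and title_contains() as in the answer; the sketch only shows why a single immediate read of the title can come back empty while a polled read succeeds.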
    


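By “page title”, I assume you mean the text shown on the browser tab at the top of the window.

A solution that requires almost no changes to your code: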

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


driver = webdriver.Firefox(executable_path=r"[path]")
driver.get("https://hapondo.qa/rent/doha/apartments/studio")


element = WebDriverWait(driver,10).until(
        EC.presence_of_element_located((By.XPATH,"/html/head/title"))
    )

print(element.get_attribute("innerHTML"))
# Output: Studio Apartments for rent in Doha | hapondo
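
A likely reason print(element.text) printed nothing in the question: WebElement.text returns only text that is rendered on the page, and <title> sits inside <head>, which is never rendered. Reading innerHTML or textContent through get_attribute() sidesteps that. Below is a minimal sketch of such a fallback. The names readable_text and FakeTitleElement are hypothetical and used only for illustration; the fake stands in for the element a real driver would return.

```python
def readable_text(element):
    """Return the element's visible text, or its textContent if nothing is rendered."""
    visible = element.text  # empty for elements that are not displayed, such as <title>
    if visible:
        return visible
    # get_attribute() also exposes DOM properties such as textContent.
    return (element.get_attribute("textContent") or "").strip()


class FakeTitleElement:
    """Hypothetical stand-in that behaves like the <title> element in the question."""

    text = ""  # nothing is rendered, so no visible text is reported

    def get_attribute(self, name):
        if name in ("textContent", "innerHTML"):
            return "Studio Apartments for rent in Doha | hapondo"
        return None


if __name__ == "__main__":
    print(readable_text(FakeTitleElement()))
```

With a real driver, readable_text() would be called on the element returned by the WebDriverWait call above.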

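Another way to get that text is simply to use driver.title.

“The title method is used to retrieve the title of the webpage the user is currently working on.”

Source: GeeksForGeeks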

from selenium import webdriver
import time
driver = webdriver.Firefox(executable_path=r"[PATH]")
driver.get("https://hapondo.qa/rent/doha/apartments/studio")
time.sleep(2)

print(driver.title)
#Output: Studio Apartments for rent in Doha | hapondo

An alternative solution with minimal changes: