R Selenium cannot find element, returns error: Selenium message: Unable to locate element

Problem description

I am scraping names from a search-result page (its URL is assigned to the variable uni in the code below), and I want to run a search for each person to collect their email address. I am doing the following, but I cannot find a way to click the button that submits the search.

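library(rvest)      # read_html(), html_nodes(), html_text(), %>%
library(stringr)    # str_split()
library(RSelenium)  # rsDriver()

#url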
uni<-"https://lsf.uni-heidelberg.de/qisserver/rds?state=change&type=6&moduleParameter=personalSelect&nextdir=change&next=SearchSelect.vm&target=personSearch&subdir=person&init=y&source=state%3Dchange%26type%3D5%26moduleParameter%3DpersonSearch%26nextdir%3Dchange%26next%3Dsearch.vm%26subdir%3Dperson%26menuid%3Dsearch%26_form%3Ddisplay%26topitem%3Dmembers%26subitem%3D%26field%3DNachname&targetfield=Nachname&_form=display"

#people's name
r<-read_html(uni)
name <- r %>%
  html_nodes("a") %>%
  html_text()
name<-name[40:length(name)]
# strip newlines and tabs from the link text
name<-gsub("\n","",name,fixed = TRUE)
name<-gsub("\t","",name,fixed = TRUE)

#people's first link
link <- r %>%
  html_nodes("a") %>%
  html_attrs() %>%
  as.character()
link<-link[40:length(link)]
link<-str_split(link,'"')
link<-sapply(link,"[",6)


#create a loop: with RSelenium, click the search button on each linked page and get the emails, which are on the next page

rD <- rsDriver(browser="firefox",port=4545L,verbose=F)
remDr <- rD[["client"]]
#remDr$navigate("https://ki.se/en/research/professors-at-ki")

for (i in seq_along(link)) {
  # i=1  # debugging leftover: uncomment to test with the first link only
  #r<- read_html(link[i])
 remDr$navigate(link[i])
  webElem <- remDr$findElement(using = 'xpath','//*[contains(concat( " ",@class," " ),concat( " ","abstand_search"," " ))]//font//input')
 
 webElem$clickElement()
 
#here i get the error
 

}


Solution

Here are some hints. When reading the page, I would collect the links with CSS selectors, which are faster and more intuitive:

library(rvest)

links <- read_html('https://lsf.uni-heidelberg.de/qisserver/rds?state=change&type=6&moduleParameter=personalSelect&nextdir=change&next=SearchSelect.vm&target=personSearch&subdir=person&init=y&source=state%3Dchange%26type%3D5%26moduleParameter%3DpersonSearch%26nextdir%3Dchange%26next%3Dsearch.vm%26subdir%3Dperson%26menuid%3Dsearch%26_form%3Ddisplay%26topitem%3Dmembers%26subitem%3D%26field%3DNachname&targetfield=Nachname&_form=display') %>% 
  html_nodes('.regular[name]') %>% 
  html_attr('href')
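
To see what the `.regular[name]` selector picks up, here is a minimal sketch run against a small inline HTML snippet. The markup and the example.org links are made up for illustration and are only assumed to resemble the real result page. The selector and the rvest calls are the ones used above.

```r
library(rvest)

# Hypothetical markup standing in for the real result list (an assumption).
html <- '<html><body>
  <a class="regular" name="p1" href="https://example.org/person?id=1">Doe, Jane</a>
  <a class="regular" href="https://example.org/help">Help</a>
  <a class="other" name="p2" href="https://example.org/other">Other</a>
  <a class="regular" name="p3" href="https://example.org/person?id=2">Roe, Richard</a>
</body></html>'

page <- read_html(html)

# .regular[name] keeps only elements that have class "regular"
# AND carry a name attribute, so the two decoy links are skipped.
people       <- html_nodes(page, ".regular[name]")
links        <- html_attr(people, "href")
person_names <- html_text(people, trim = TRUE)

print(links)
print(person_names)
```

Selecting by class and attribute avoids the positional filter `[40:length(name)]` from the question, which silently breaks if the number of navigation links on the page changes.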

Then I would use the same strategy to locate the search button:

webElem <- remDr$findElement(using = 'css selector','.abstand_search + [value="Suche starten"]')  # matches the search button that is actually interactable
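
The `+` in that selector is the adjacent sibling combinator: it matches an element with `value="Suche starten"` only when it comes immediately after an element with class `abstand_search`. A minimal sketch of that behaviour, checked offline with rvest against a hypothetical form (the real page's markup is assumed, not verified):

```r
library(rvest)

# Hypothetical form markup; the real page may be structured differently.
html <- '<html><body><form>
  <input class="abstand_search" type="text" name="Nachname">
  <input type="submit" value="Suche starten">
  <input type="reset" value="Abbrechen">
</form></body></html>'

form_page <- read_html(html)
sel <- '.abstand_search + [value="Suche starten"]'

# Only the submit button directly following .abstand_search matches.
button <- html_nodes(form_page, sel)
print(length(button))
print(html_attr(button, "type"))

# In a live RSelenium session the same selector string would be passed on:
# webElem <- remDr$findElement(using = "css selector", sel)
# webElem$clickElement()
```

Testing a selector on saved or inline HTML first makes it easier to tell a wrong selector apart from a page that has not finished loading in the browser.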
  

Finally, I would get the name and email from the target page:

name <- remDr$findElement(using = 'css selector','.regular')
email <- remDr$findElement(using = 'css selector','[href*=mail]') # could also take 2nd match for .regular
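
Note that `findElement` returns a web element object rather than text. In RSelenium the text and attributes can be read with `getElementText()` and `getElementAttribute("href")`. As an alternative sketch, the same two selectors can be applied with rvest to the page HTML. The detail-page markup below is hypothetical, and stripping a `mailto:` prefix assumes the email is exposed as a mailto link.

```r
library(rvest)

# Hypothetical detail-page markup (an assumption about the real page).
html <- '<html><body>
  <a class="regular" href="person?id=1">Doe, Jane</a>
  <a class="regular" href="mailto:jane.doe@example.org">jane.doe@example.org</a>
</body></html>'

# With a live session, the HTML could instead come from
# remDr$getPageSource()[[1]].
page <- read_html(html)

# First .regular element holds the name; the first link whose href
# contains "mail" holds the address.
person_name <- html_text(html_node(page, ".regular"), trim = TRUE)
email_href  <- html_attr(html_node(page, "[href*=mail]"), "href")
email       <- sub("^mailto:", "", email_href)

print(person_name)
print(email)
```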

I solved this problem by using rvest in a loop, in the following way:

onClick

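It is not elegant code, but it works. Because the names are more complicated, I could not find a way to get the right CSS selector or XPath. Let me know if you can think of more elegant and faster code, or whether the problem can only be solved by brute force.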