RSelenium - Scraping from multiple URLs

Problem description

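I want to loop over all of the URLs, scrape each page, and bind the results into a single data frame. My problem is that my for loop only keeps the results from the last URL.

It would also be appreciated if the data frame included a column with the URL of the page each row was scraped from, so the rows can be traced easily. Thanks for the help. The code is as follows: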

library(tidyverse)
library(dplyr)
library(magrittr)


driver <- RSelenium::rsDriver(browser = "chrome",chromever =
                                system2(command = "wmic",args = 'datafile where name="C:\\\\Program Files (x86)\\\\Google\\\\Chrome\\\\Application\\\\chrome.exe" get Version /value',stdout = TRUE,stderr = TRUE) %>%
                                stringr::str_extract(pattern = "(?<=Version=)\\d+\\.\\d+\\.\\d+\\.") %>%
                                magrittr::extract(!is.na(.)) %>%
                                stringr::str_replace_all(pattern = "\\.",replacement = "\\\\.") %>%
                                paste0("^",.) %>%
                                stringr::str_subset(string =
                                                      binman::list_versions(appname = "chromedriver") %>%
                                                      dplyr::last()) %>%
                                as.numeric_version() %>%
                                max() %>%
                                as.character())

url <- c("https://shopee.ph/shop/57465664/search","https://shopee.ph/shop/29990515/search")
remote_driver <- driver[["client"]] 
for (i in 1:(length(url))){
  remote_driver$navigate(paste0(url[[i]]))
  Sys.sleep(1)
  
  name <- remote_driver$findElements(using = 'class',value = 'PFM7lj')
  
  name <- lapply(name,function(x) 
    x$getElementText())
  
  name <- unlist(name)
  
  price <- remote_driver$findElements(using = 'class',value = '_29R_un')
  
  price <- lapply(price,function(x) 
    
    x$getElementText())

  price <- unlist(price)

  #shopee <- cbind(data.frame(name),data.frame(price))
  shopee<- rbind(name,price)
  final <- cbind(data.frame(name),data.frame(price))
}


final

Solution

No verified solution to this problem has been found yet.
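
As a starting point, the symptom described in the question is consistent with `final` being reassigned on every pass through the loop, so only the data from the last page survives. Below is a minimal sketch of one way around that: build one data frame per URL, tag it with its source URL, and row-bind everything at the end. The helper names `scrape_page` and `scrape_all` are hypothetical and not part of RSelenium. The class names 'PFM7lj' and '_29R_un' are taken from the question and may have changed on the site. The sketch also assumes each page yields the same number of names and prices, and that a fixed pause is enough for the page to render.

```r
# Hypothetical helper: scrape a single page and return a data frame for it.
scrape_page <- function(remote_driver, page_url, wait_seconds = 1) {
  remote_driver$navigate(page_url)
  Sys.sleep(wait_seconds)  # crude fixed wait, as in the question

  # Collect the text of every element that has the given class.
  get_text <- function(class_name) {
    elements <- remote_driver$findElements(using = "class", value = class_name)
    # as.character() turns the NULL from an empty result into character(0)
    as.character(unlist(lapply(elements, function(x) x$getElementText())))
  }

  name  <- get_text("PFM7lj")    # class name taken from the question
  price <- get_text("_29R_un")   # class name taken from the question

  # Assumption: name and price have equal length; data.frame() stops otherwise.
  data.frame(
    url   = rep(page_url, length(name)),  # source page for each row
    name  = name,
    price = price,
    stringsAsFactors = FALSE
  )
}

# Hypothetical helper: scrape every URL and stack the per-page data frames.
scrape_all <- function(remote_driver, urls, wait_seconds = 1) {
  pages <- lapply(urls, function(u) scrape_page(remote_driver, u, wait_seconds))
  do.call(rbind, pages)
}

# Usage with the objects defined in the question:
# final <- scrape_all(remote_driver, url)
```

Keeping the per-page work in its own function means nothing is overwritten between iterations, and the `url` column records where each row came from. If the pages load slowly, the element lists may come back empty, in which case a longer `wait_seconds` is worth trying.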
