How do I correctly wait for a page to load with RSelenium?

Question

I am trying to scrape the names and locations of breweries across the United States from this link:

https://www.brewersassociation.org/directories/breweries/

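As you can see, the HTML takes a second or so to load. This means that when I scrape the HTML with RSelenium, only half of the page has loaded. Here is the code I am running, which should be reproducible for anyone using RSelenium: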

remDr <- RSelenium::remoteDriver(remoteServerAddr = "localhost", port = 4445L, browserName = "chrome")
remDr$open()
remDr$setTimeout(type = "page load")
remDr$navigate("https://www.brewersassociation.org/directories/breweries/?location=MI")

remDr$screenshot(display = TRUE)

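However, if you look at the screenshot, only half of the page has loaded. I have tried setting timeouts and a few other commands, but none of them seem to let the page load fully. Any suggestions or ideas on how to fix this?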

Answer

You can try the following:

library(RSelenium)
# Start a Selenium server and a Firefox session (any free port will do)
driver <- rsDriver(browser = "firefox", port = 4567L)
remote_driver <- driver[["client"]]
remote_driver$navigate("https://www.brewersassociation.org/directories/breweries/?location=MI")
# Give the page a few seconds to render
Sys.sleep(3)
# Grab the <body> element so that key presses can be sent to the page
scroll_d <- remote_driver$findElement(using = "css selector", value = "body")
# Pressing End scrolls to the bottom. A single press is not enough to load
# the whole list, but repeating it is a way to automate the scrolling
# until the full page is visible.
scroll_d$sendKeysToElement(list(key = "end"))
# To decide when to stop, you could, for example, use the alphabetical
# order of the list to track how far it has loaded.

This answer is just one approach/idea for solving the problem.
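
As an alternative to a fixed Sys.sleep(), one could poll until the content of interest appears. The following is only a minimal sketch, not part of the original answer: wait_until() is a hypothetical helper written for illustration (it is not an RSelenium function), and the ".brewery-item" selector in the commented usage is a placeholder that would have to be replaced after inspecting the actual page. A stand-in condition is used so the sketch runs without a browser.

```r
# Poll `condition` (a zero-argument function returning TRUE/FALSE) until it
# succeeds or `timeout` seconds have elapsed.
# Returns TRUE on success and FALSE on timeout.
wait_until <- function(condition, timeout = 10, interval = 0.5) {
  deadline <- Sys.time() + timeout
  repeat {
    # Treat errors (e.g. an element that does not exist yet) as "not ready"
    ok <- tryCatch(isTRUE(condition()), error = function(e) FALSE)
    if (ok) return(TRUE)
    if (Sys.time() >= deadline) return(FALSE)
    Sys.sleep(interval)
  }
}

# Stand-in condition so the sketch runs without a browser:
# it reports "ready" on the third check.
checks <- 0
fake_ready <- function() {
  checks <<- checks + 1
  checks >= 3
}
loaded <- wait_until(fake_ready, timeout = 5, interval = 0.1)

# With RSelenium the condition might look like this
# (".brewery-item" is a hypothetical selector, not taken from the page):
# wait_until(function() {
#   found <- remote_driver$findElements(using = "css selector",
#                                       value = ".brewery-item")
#   length(found) > 0
# }, timeout = 15)
```

Polling like this returns as soon as the condition holds, so it avoids both waiting longer than necessary and giving up too early, which a single fixed sleep cannot guarantee.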
