Problem description
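I have a series of XPaths that correspond to job listings on a web page.
For example, the XPath of one job is //*[@id="ctl00_CPH1_vcyS_vsGrid_ctl00_ctl04_Title"]
The XPath of another job is //*[@id="ctl00_CPH1_vcyS_vsGrid_ctl00_ctl10_Title"]
The only part that changes is the two-digit number in the XPath (for example, the 04 in ctl04).
So I want to write a for loop that iterates over the XPaths, going from 04 to 18 in steps of 1. I have the following code: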
for (i in seq(from = 04,to = 18,by = 1)) {
title_xpath <- sprintf('//*[@id="ctl00_CPH1_vcyS_vsGrid_ctl00_ctl%g_Title"]',i)
}
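I assumed that, through sprintf, '%g' would be replaced by the value of i in the for loop (i.e. 04, then 05, and so on) up to 18. But that is not what happens.
Any ideas?
Edit: Thanks for the suggestions so far. However, they do not work when I run the full code (shown below):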
title_list <- list()
item_count <- 1
for (i in seq(from = 1,by = 1)) {
title_xpath <- sprintf('//*[@id="ctl00_CPH1_vcyS_vsGrid_ctl00_ctl%02d_Title"]',i)
# Find the element on the website and transform it to text directly
job_title <- driver$findElement(using = "xpath",value = title_xpath)$getElementText()[[1]]
# Add the outcome to the list
title_list[[item_count]] <- job_title
item_count <- item_count + 1
}
print(title_list)
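The part that does not work is the XPath. If I change the XPath from ctl%02d to ctl04, the job title at position ctl04 is printed 18 times. What I want is for the code to print the job titles corresponding to ctl04, ctl05, and so on, up to ctl18. Any help is appreciated.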
Solution
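You probably need %02d, which pads the integer with a leading zero to a width of two digits. sprintf is vectorized, so no loop is needed to build the XPaths: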
title_xpath <- sprintf('//*[@id="ctl00_CPH1_vcyS_vsGrid_ctl00_ctl%02d_Title"]',4:18)
title_xpath
# [1] "//*[@id=\"ctl00_CPH1_vcyS_vsGrid_ctl00_ctl04_Title\"]"
# [2] "//*[@id=\"ctl00_CPH1_vcyS_vsGrid_ctl00_ctl05_Title\"]"
# [3] "//*[@id=\"ctl00_CPH1_vcyS_vsGrid_ctl00_ctl06_Title\"]"
# [4] "//*[@id=\"ctl00_CPH1_vcyS_vsGrid_ctl00_ctl07_Title\"]"
# [5] "//*[@id=\"ctl00_CPH1_vcyS_vsGrid_ctl00_ctl08_Title\"]"
# [6] "//*[@id=\"ctl00_CPH1_vcyS_vsGrid_ctl00_ctl09_Title\"]"
# [7] "//*[@id=\"ctl00_CPH1_vcyS_vsGrid_ctl00_ctl10_Title\"]"
# [8] "//*[@id=\"ctl00_CPH1_vcyS_vsGrid_ctl00_ctl11_Title\"]"
# [9] "//*[@id=\"ctl00_CPH1_vcyS_vsGrid_ctl00_ctl12_Title\"]"
#[10] "//*[@id=\"ctl00_CPH1_vcyS_vsGrid_ctl00_ctl13_Title\"]"
#[11] "//*[@id=\"ctl00_CPH1_vcyS_vsGrid_ctl00_ctl14_Title\"]"
#[12] "//*[@id=\"ctl00_CPH1_vcyS_vsGrid_ctl00_ctl15_Title\"]"
#[13] "//*[@id=\"ctl00_CPH1_vcyS_vsGrid_ctl00_ctl16_Title\"]"
#[14] "//*[@id=\"ctl00_CPH1_vcyS_vsGrid_ctl00_ctl17_Title\"]"
#[15] "//*[@id=\"ctl00_CPH1_vcyS_vsGrid_ctl00_ctl18_Title\"]"
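To connect this back to the loop in the question, here is a minimal sketch of how the zero-padded XPaths might be used. The helper names build_title_xpaths and collect_titles are hypothetical, not part of RSelenium, and find_text stands in for whatever function returns the text for one XPath. The sketch assumes the loop index should cover 4 to 18, and that a missing element should yield NA rather than stop the loop.

```r
# %g prints the number in its shortest form, so the leading zero is lost.
sprintf("ctl%g", 4)     # "ctl4"
# %02d pads an integer to a width of 2 with zeros.
sprintf("ctl%02d", 4L)  # "ctl04"

# Hypothetical helper: build the XPaths for ctl<from> .. ctl<to>.
build_title_xpaths <- function(from = 4L, to = 18L) {
  sprintf('//*[@id="ctl00_CPH1_vcyS_vsGrid_ctl00_ctl%02d_Title"]',
          seq.int(from, to))
}

# Hypothetical helper: apply `find_text` (a function mapping one XPath to
# its text) to every XPath. Errors such as a missing element become NA
# (an assumption made for this sketch).
collect_titles <- function(xpaths, find_text) {
  lapply(xpaths, function(xp) {
    tryCatch(find_text(xp), error = function(e) NA_character_)
  })
}

title_xpaths <- build_title_xpaths()

# With an RSelenium driver as in the question (not run here):
# title_list <- collect_titles(title_xpaths, function(xp) {
#   driver$findElement(using = "xpath", value = xp)$getElementText()[[1]]
# })
```

Passing the lookup in as a function keeps the XPath construction separate from the browser session, so the strings can be checked before any page is queried.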