R web scraping with rvest: forms and submit_form

Problem description

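I'm new to R and don't know much about HTML, XML, and so on. I'm trying to scrape a website that requires a selection from a drop-down menu. This is for an academic paper that applies text and sentiment analysis to press releases from members of Congress. I'm not a programmer, so please be gentle!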

memberUrl = 'https://grijalva.house.gov/press-releases/'
session <- html_session(memberUrl)
forms <- html_form(session)
yearForm <- forms[[4]]
#--- so far so good (I think) -- and i have successfully scraped sites that don't have drop downs
#--- but here is where I get confused and can't find a good tutorial on forms and submit_form
set_values(yearForm,??? ) #----- get stuck on how to use set_values
submit_form( session,yearForm,???) #--- and here

Thanks! Jim

Solution

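submit_form does not work here, probably because the form is submitted via JavaScript. A workaround is to send the POST request directly: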

library(rvest)
memberUrl = 'https://grijalva.house.gov/press-releases/'
session <- html_session(memberUrl)

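# request_POST is an internal (unexported) rvest function, hence the ':::';
# it may not be available in every rvest version.
session <- rvest:::request_POST(session, memberUrl, body = list(
                                  getNewsByyear = "2018"  # change the year here; 'getNewsByyear' is the name of the dropdown list
                                ))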

titles <- read_html(session) %>%
  html_nodes("ul > li > h3") %>%
  html_text()
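
The field name used in the POST body has to match the name attribute of the dropdown in the page's HTML. Here is a minimal sketch of how to look that name up, together with the values the dropdown accepts. The HTML snippet is a hypothetical stand-in for the real page (the actual markup may differ); on the live site you would pass the session or page object instead of the inline string.

```r
library(rvest)

# Hypothetical markup standing in for the real page; an assumption for illustration.
page <- read_html('
<form method="post">
  <select name="getNewsByyear">
    <option value="2017">2017</option>
    <option value="2018">2018</option>
  </select>
</form>')

# Name of the dropdown: this is the key to use in the POST body.
field_name <- page %>%
  html_nodes("select") %>%
  html_attr("name")

# Values the dropdown accepts (one per <option>).
years <- page %>%
  html_nodes("select option") %>%
  html_attr("value")

print(field_name)
print(years)
```

With the list of years in hand, the POST request above could be repeated once per year to collect all press releases.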
