wget --no-check-certificate --save-cookies cookies --keep-session-cookies \
     --post-data="username=example&password=example" \
     "https://example.com/index.php?title=Special:Userlogin&returntotitle="

wget --no-check-certificate --load-cookies=cookies \
     --no-parent -r --level=2 -nc -E \
     https://example.com/Special:Sitemap
However, this does not work for DekiWiki sites that require a login.
The problem seems to be the one described in man wget:
Note: if Wget is redirected after the POST request is completed, it will not send the
POST data to the redirected URL. This is because URLs that process POST often respond
with a redirection to a regular page, which does not desire or accept POST. It is not
completely clear that this behavior is optimal; if it doesn't work out, it might be
changed in the future.
Question
Can this be done with Perl, for example with HTML::TreeBuilder 3, HTML::TokeParser, Mechanize, or any other Perl module?
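Solution
Some sites that require a login do not send the cookie in the response to the login request itself. Instead, they send a redirect response (302 Object Moved), which most browsers follow automatically, and the cookie is then sent in the response for the redirected page.
I handle this with curl by enabling the libcurl option CURLOPT_FOLLOWLOCATION; with the command-line tool, use the --location option. Like wget, curl is a free tool.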
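# Log in: follow the redirect and store the session cookie in cookie.txt.
curl --cookie cookie.txt --cookie-jar cookie.txt \
     --data-urlencode "username=example" \
     --data-urlencode "password=example" \
     --insecure --location \
     "https://example.com/index.php?title=Special:Userlogin&returntotitle="
# Fetch the sitemap with the stored cookie.
curl --cookie cookie.txt --insecure --location \
     -o downloadedfile.html \
     https://example.com/Special:Sitemap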
http://curl.haxx.se/download.html
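To check whether a given site really answers the login with a redirect, the sketch below sends the login request without --location and prints only the status code and the redirect target. The helper name show_login_redirect, the CURL override variable, and the username/password field names are assumptions made for illustration; the real form fields depend on the site.

```shell
# Set CURL=echo to print the arguments instead of sending a request.
CURL="${CURL:-curl}"

# Hypothetical helper: POST the login form WITHOUT following redirects and
# print the HTTP status code followed by the redirect target, if any.
# $1 = login URL, $2 = user name, $3 = password (assumed field names).
show_login_redirect() {
    "$CURL" --silent --insecure --output /dev/null \
        --write-out '%{http_code} %{redirect_url}' \
        --data-urlencode "username=$2" \
        --data-urlencode "password=$3" \
        "$1"
}
```

For example, show_login_redirect "https://example.com/index.php?title=Special:Userlogin&returntotitle=" example example. A 302 status with a non-empty redirect URL would suggest the behaviour described above, in which case --location is needed for the cookie to be captured.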
Also, a login form sometimes requires a multipart/form-data POST rather than an application/x-www-form-urlencoded POST. To make curl send multipart/form-data, change --data-urlencode to -F.
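A minimal sketch of the multipart variant, assuming the form uses the same username and password fields as the curl command above. The helper name login_multipart and the CURL override variable are hypothetical and exist only for illustration.

```shell
# Set CURL=echo to print the arguments instead of sending a request.
CURL="${CURL:-curl}"

# Hypothetical helper: log in with a multipart/form-data POST.
# $1 = login URL, $2 = user name, $3 = password (assumed field names).
# Note: -F treats a value starting with @ or < as a file reference;
# curl's --form-string option sends such values literally instead.
login_multipart() {
    "$CURL" --cookie cookie.txt --cookie-jar cookie.txt \
        --insecure --location \
        -F "username=$2" \
        -F "password=$3" \
        "$1"
}
```

It would be called as login_multipart "https://example.com/index.php?title=Special:Userlogin&returntotitle=" example example, after which cookie.txt can be reused for later requests.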