Problem description
I am trying to crawl https://www.adecco.ch/en-us/job-results, but I cannot load the HTML from this page: after fetching it, nothing usable ends up in the HTML document.
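Solution
As mentioned in my comment, the site sends its content back compressed, and it is not being decompressed before you try to load it, so you are essentially loading gibberish. This code should work: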
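using System.Linq;
using System.Net;
using System.Net.Http;
using HtmlAgilityPack;
var url = "https://www.adecco.ch/en-us/job-results";
var handler = new HttpClientHandler();
// This is the important bit: let the handler decompress the response body.
// DecompressionMethods.All needs .NET Core 3.0 or later; on older targets use GZip | Deflate.
handler.AutomaticDecompression = DecompressionMethods.All;
var httpClient = new HttpClient(handler);
var html = await httpClient.GetStringAsync(url);
var htmlDocument = new HtmlDocument();
htmlDocument.LoadHtml(html);
// Descendants() returns every node; pass "div" to keep only <div> elements.
var divs = htmlDocument.DocumentNode.Descendants().ToList();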
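
To see why the undecompressed response looks like gibberish, here is a minimal offline sketch. It does not touch the real site: the `GzipDemo` class, its `Compress` and `Decompress` helpers, and the sample markup are all made up for illustration, and it assumes the server used gzip (the actual encoding may differ).

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class GzipDemo
{
    // Gzip-compress a string, standing in for a server that replies
    // with "Content-Encoding: gzip".
    public static byte[] Compress(string text)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                var bytes = Encoding.UTF8.GetBytes(text);
                gzip.Write(bytes, 0, bytes.Length);
            } // disposing the GZipStream flushes the remaining gzip data
            return output.ToArray();
        }
    }

    // Reverse the compression, which is what AutomaticDecompression
    // does for you on the response body.
    public static string Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        var body = Compress("<html><body><div>jobs</div></body></html>");
        // Decoding the compressed bytes directly gives unreadable text,
        // which is what the HTML parser was being fed.
        Console.WriteLine(Encoding.UTF8.GetString(body));
        // After decompression the original markup is back.
        Console.WriteLine(Decompress(body));
    }
}
```

If you would rather not rely on the handler, decompressing the response stream yourself along these lines should also work, but setting `AutomaticDecompression` is the simpler option.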