Problem description
I am trying to scrape a website using cheerio:
const rp = require('request');
const cheerio = require('cheerio');
rp('https://www.fideyo.com/list', (error, response, html) => {
  if (!error && response.statusCode == 200) {
    const $ = cheerio.load(html);
    console.log($.html());
  }
});
But it returns an incomplete HTML body, like this:
<body>
<div id="app"></div>
<script type="text/javascript" src="https://cdn.fideyo.com/static/main.js?v=13"></script>
<!-- Google Tag Manager (noscript) -->
<noscript>
<iframe src="https://www.googletagmanager.com/ns.html?id=GTM-KBGVCP3"
height="0" width="0" style="display:none;visibility:hidden">
</iframe>
</noscript>
<!-- End Google Tag Manager (noscript) -->
</body></html>
When I load the site in Chrome, there is content inside the app section.
How can I access the content of the app section?
Solution
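If I understand correctly, the content of the app section is probably created dynamically by JavaScript (the main.js script in the response), and cheerio does not run JavaScript; it only parses the static HTML it is given.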
You need something like https://github.com/puppeteer/puppeteer/, which drives a real (headless) browser, so the page's scripts run before you read the HTML.
The documentation is here: https://github.com/puppeteer/puppeteer/blob/main/docs/api.md
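As a rough illustration of that approach, here is a minimal sketch, assuming puppeteer has been installed with npm. The helper name fetchRenderedHtml and its third parameter are made up for this example and are not part of Puppeteer; the third parameter only exists so the library can be swapped out. Whether waiting for 'networkidle0' is enough for this particular site is also an assumption.

```javascript
// Sketch only: assumes `npm install puppeteer` has been run.
// `fetchRenderedHtml` is a hypothetical helper, not a Puppeteer API.
// `puppeteerLib` defaults to the real library and is loaded lazily.
async function fetchRenderedHtml(url, selector, puppeteerLib = require('puppeteer')) {
  const browser = await puppeteerLib.launch();
  try {
    const page = await browser.newPage();
    // 'networkidle0' treats navigation as finished once there have been
    // no network connections for at least 500 ms, which gives the page's
    // scripts a chance to render their content.
    await page.goto(url, { waitUntil: 'networkidle0' });
    // Runs in the browser: return the rendered markup of the matched element.
    return await page.$eval(selector, (el) => el.outerHTML);
  } finally {
    // Always close the browser, even if navigation or evaluation fails.
    await browser.close();
  }
}
```

Calling fetchRenderedHtml('https://www.fideyo.com/list', '#app') returns a promise for the rendered markup of the app section. If you still want cheerio's selector API, you can pass that string to cheerio.load() as in the question.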