Extracting two text values from HTML elements with cheerio

Problem description

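Using cheerio, with $ defined as the cheerio object, I am trying to get two texts (the current price and the original price) from some elements that only have ids defined in the HTML. Any clues on how to achieve this?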

Here is the HTML snippet that contains the two values:

<div class="buy-Box__element">
   <div class="clp-component-render">
      <div class="clp-component-render">
         <div class="ud-component--course-landing-page-udlite--price-text" ng-non-bindable="">
            <div>
               <div class="price-text--container--Ws-fP udlite-clp-price-text" data-purpose="price-text-container">
                  <div class="price-text--price-part--Tu6MH udlite-clp-discount-price udlite-heading-xl" data-purpose="course-price-text"><span class="udlite-sr-only">Current price</span><span><span>₹700</span></span></div>
                  <div class="price-text--price-part--Tu6MH price-text--original-price--2e-F5 udlite-clp-list-price udlite-text-sm" data-purpose="original-price-container">
                     <div data-purpose="course-old-price-text"><span class="udlite-sr-only">Original Price</span><span><s><span>₹1,280</span></s></span></div>
                  </div>
                  <div class="price-text--price-part--Tu6MH udlite-clp-percent-discount udlite-text-sm" data-purpose="discount-percentage"><span class="udlite-sr-only">discount</span><span>45% off</span></div>
               </div>
            </div>
         </div>
      </div>
   </div>
</div>

It works fine with XPath, but I want to do it with cheerio. I also tried the following:

$(".price-text--price-part--Tu6MH udlite-clp-discount-price udlite-heading-xl udlite-sr-only")[0].innerText
$(".price-text--price-part--Tu6MH udlite-clp-discount-price udlite-heading-xl udlite-sr-only")

Solution

Could you try this?

html should be the page's inner HTML; you can get it with a library such as puppeteer, for example let html = await page.evaluate(() => document.body.innerHTML);

$('span:contains("Current price")',html).each(function() {
        let CurrentPrice1 = $(this).next().text();
        let CurrentPrice2 = Number(CurrentPrice1.replace(/[^0-9.-]+/g,""));
        console.log(CurrentPrice1); //this with symbol
        console.log(CurrentPrice2); //this for only fetching the numeric value
    });

To get the original price, replace "Current price" with "Original Price" in the selector.
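The numeric conversion in the snippet above can be pulled out into a small helper. This is only a sketch: parsePrice is a hypothetical name, not part of cheerio, and it assumes prices use "," as a thousands separator and "." as the decimal point, as in the sample HTML.

```javascript
// Hypothetical helper (not a cheerio API): turn a price label such as
// "₹1,280" into a number by dropping everything except digits, "." and "-".
function parsePrice(text) {
  const cleaned = text.replace(/[^0-9.-]+/g, '');
  // Number('') is 0, so report NaN explicitly when no digits were found.
  return cleaned === '' ? NaN : Number(cleaned);
}

console.log(parsePrice('₹700'));   // 700
console.log(parsePrice('₹1,280')); // 1280
```

Returning NaN for empty input makes a missing element easier to notice than a silent 0.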

Alternatively, you can do the following:

$('span:contains("Current price") + span span').text()