Google Sheets: Importing a number from a website using ImportXML

Question

I have no coding experience!

I can't get data from a website scraped into a Google spreadsheet. I want to pull the observation count from this page into my spreadsheet.

I have tried this, but honestly I don't know what I'm doing:

=IMPORTXML(A3,"//*[@id="obsstatcol"]/div/div[1]")

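A3 holds the URL of the page above. The rest is a mash-up of a few tutorials I found and the XPath I copied for the observation count I am trying to scrape from the page.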

Can someone help me work out what I actually need to do, and offer some advice?

Thanks in advance

Answer

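Nice try! Unfortunately, the observation count is not filled in until after the page has loaded; the page's JavaScript inserts it in the browser. That means your formula: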

=IMPORTXML(A3,"//*[@id=""obsstatcol""]/div/div[1]")

returns the unrendered template placeholder:

{{ shared.numberWithCommas( totalObservations ) }}

So you can't get the number with ImportXML() alone in this case.

But all is not lost. I opened the network monitor with F12 and saw that the page makes a web request to this URL:

https://api.inaturalist.org/v1/observations/observers?verifiable=any&quality_grade=needs_id&user_id=ericthuranira&locale=en-US

to fetch the observation data, which appears to be in JSON format. For example (formatted for readability):

{
  "total_results": 1,"page": 1,"per_page": 500,"results": [
    {
      "user_id": 1265521,"observation_count": 121,"species_count": 42,"user": {
        "id": 1265521,"login": "ericthuranira","spam": false,"suspended": false,"created_at": "2018-10-09T11:43:22+00:00","login_autocomplete": "ericthuranira","login_exact": "ericthuranira","name": "Eric Thuranira","name_autocomplete": "Eric Thuranira","orcid": null,"icon": "https://static.inaturalist.org/attachments/users/icons/1265521/thumb.jpeg?1580369132","observations_count": 237,"identifications_count": 203,"journal_posts_count": 0,"activity_count": 440,"species_count": 150,"universal_search_rank": 237,"roles": [],"site_id": 1,"icon_url": "https://static.inaturalist.org/attachments/users/icons/1265521/medium.jpeg?1580369132"
      }
    }
  ]
}
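To make it clear where the number sits in that response, here is a minimal JavaScript sketch. It assumes the response has the shape shown above (trimmed here to the fields that matter); `getObservationCount` is a hypothetical helper name used only for illustration.

```javascript
// Trimmed copy of the response shown above; only the fields needed here are kept.
const sampleResponse = {
  total_results: 1,
  page: 1,
  per_page: 500,
  results: [
    { user_id: 1265521, observation_count: 121, species_count: 42 }
  ]
};

// Hypothetical helper: read observation_count from the first entry in "results".
function getObservationCount(response) {
  if (!Array.isArray(response.results) || response.results.length === 0) {
    throw new Error("No results in response");
  }
  return response.results[0].observation_count;
}

console.log(getObservationCount(sampleResponse)); // 121
```

The value you are after is `results[0].observation_count`, not the `observations_count` field nested under `user`, which is a different total.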

This is not XML, so you need a JSON parser to read it. Fortunately, someone has written one for Google Sheets! You can get the number yourself by doing the following:

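First, paste the code from here into the script editor (Tools > Script editor; in newer versions of Sheets, Extensions > Apps Script) and save it as ImportJSON. This gives you the JSON parser.
Then take the "api" URL for observers that I mentioned above and use this formula (assuming that URL is in A3):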
```
=ImportJSON(A3,"/results/observation_count","noHeaders")
```

This will give you the number you want.
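If you would rather not paste in the whole ImportJSON script, a small custom function of your own can do the same job for this one value. The following is only a sketch under assumptions: the names `parseObservationCount` and `OBSERVATION_COUNT` are made up for illustration, the response is assumed to look like the JSON above, and the second function only works inside Apps Script, where `UrlFetchApp` is available.

```javascript
// Hypothetical helper: parse the JSON text and return the first observation_count.
function parseObservationCount(jsonText) {
  const data = JSON.parse(jsonText);
  const first = (data.results || [])[0];
  if (!first) {
    throw new Error("Response contains no results");
  }
  return first.observation_count;
}

/**
 * Hypothetical custom function for Google Sheets, e.g. =OBSERVATION_COUNT(A3)
 * Fetches the API URL and returns the observation count.
 * Runs only inside Apps Script, which provides UrlFetchApp.
 * @customfunction
 */
function OBSERVATION_COUNT(url) {
  const response = UrlFetchApp.fetch(url);
  return parseObservationCount(response.getContentText());
}
```

With the API URL in A3, you would then enter `=OBSERVATION_COUNT(A3)` in a cell. Keeping the parsing in its own function means you can check it against a pasted copy of the JSON without making a web request.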

google-sheets-importxml web-scraping