Because the dictionary keys are duplicated, one <ht:news_item> tag is skipped in every <item>! How can this be avoided?

Problem description

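I am trying to scrape Google Trends daily trending searches from its RSS feed. As the first screenshot shows, for each <item> I want the item itself plus its two <ht:news_item> entries. With my code I only get the <item> and the first <ht:news_item>; the second <ht:news_item> is skipped.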

Website: https://trends.google.com/trends/trendingsearches/daily?geo=IN
RSS feed: https://trends.google.com/trends/trendingsearches/daily/rss?geo=IN

import requests
from bs4 import BeautifulSoup
import csv
import pandas as pd

url = "https://trends.google.co.in/trends/trendingsearches/daily/RSS?geo=IN"
resp = requests.get(url)
soup = BeautifulSoup(resp.content,'xml')

items = soup.findAll('item')
news_items = []  

for item in items:
    news_item = {}
    news_item['title'] = item.title.text
    news_item['approx_traffic']=item.approx_traffic.text
    news_item['description'] = item.description.text
    news_item['link'] = item.link.text
    news_item['pubDate']=item.pubDate.text
    news_item['picture'] = item.picture.text
    news_item['picture_source'] = item.picture_source.text
    news_item['news_item_title']=item.news_item_title.text
    news_item['news_item_snippet'] = item.news_item_snippet.text
    news_item['news_item_url'] = item.news_item_url.text
    news_item['news_item_source'] = item.news_item_source.text
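    # These four lines reuse the keys set just above, and attribute access such as
    # item.news_item_title returns only the first matching tag inside <item>,
    # so the second <ht:news_item> never reaches the dictionary.
    news_item['news_item_title']=item.news_item_title.text
    news_item['news_item_snippet'] = item.news_item_snippet.text
    news_item['news_item_url'] = item.news_item_url.text
    news_item['news_item_source'] = item.news_item_source.text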
    news_items.append(news_item)
print(news_items)

print(len(news_item))

df = pd.DataFrame(news_items)

df.to_csv('my_csv.csv',columns=["title","approx_traffic","description","link","pubDate","picture","picture_source","news_item_title","news_item_snippet","news_item_url","news_item_source","news_item_source"])

Solution

No confirmed solution to this problem has been found yet.
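
One possible way to avoid the overwrite, offered as a sketch rather than a verified fix: loop over every <ht:news_item> inside each <item> and give each one its own numbered keys (news_item_title_1, news_item_title_2, and so on). To keep the example runnable without network access, it swaps BeautifulSoup for the standard-library xml.etree.ElementTree and parses a made-up sample string. SAMPLE_RSS, its placeholder namespace URI, NEWS_FIELDS, parse_items and the numbered key names are all assumptions for illustration and do not come from the real feed.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample shaped like the feed described in the question.
# The namespace URI is a placeholder, not the one the real feed declares.
SAMPLE_RSS = """<rss version="2.0" xmlns:ht="https://example.com/ht">
  <channel>
    <item>
      <title>Example trend</title>
      <ht:approx_traffic>10,000+</ht:approx_traffic>
      <ht:news_item>
        <ht:news_item_title>First headline</ht:news_item_title>
        <ht:news_item_url>https://example.com/1</ht:news_item_url>
      </ht:news_item>
      <ht:news_item>
        <ht:news_item_title>Second headline</ht:news_item_title>
        <ht:news_item_url>https://example.com/2</ht:news_item_url>
      </ht:news_item>
    </item>
  </channel>
</rss>"""

NEWS_FIELDS = ("news_item_title", "news_item_snippet",
               "news_item_url", "news_item_source")


def parse_items(xml_text):
    """Return one dict per <item>, with numbered keys for each news item."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.iter("item"):
        row = {
            "title": item.findtext("title", default=""),
            # "{*}" matches the tag in any namespace (Python 3.8+).
            "approx_traffic": item.findtext("{*}approx_traffic", default=""),
        }
        # Visit every news item instead of only the first one, and suffix
        # the keys with a counter so later entries do not overwrite earlier ones.
        for n, news in enumerate(item.findall("{*}news_item"), start=1):
            for field in NEWS_FIELDS:
                row[f"{field}_{n}"] = news.findtext("{*}" + field, default="")
        rows.append(row)
    return rows


rows = parse_items(SAMPLE_RSS)
print(rows[0]["news_item_title_1"])
print(rows[0]["news_item_title_2"])
```

The same idea should carry over to the original BeautifulSoup code by iterating over item.find_all('news_item') and reading the title, snippet, URL and source from each result, instead of calling item.news_item_title twice. Because the keys are now distinct, pd.DataFrame(rows) would produce one column per numbered field, so the columns list passed to to_csv would need the numbered names as well.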
