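Problem description
Dear fellow developers,
I hope you are all doing well.
I am just starting out with Python (I have taken a few courses) and decided to improve my skills by analyzing a YouTube channel (Amixem, a French YouTuber).
To do that, I am following this tutorial to load YouTube data (from the API) into a DataFrame, so that I can run some statistics on it afterwards.
I don't understand which variable should be used on line 35 (code provided below) in place of video_ids: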
for i in range(0,len(video_ids),40):
I get:
NameError: name 'video_ids' is not defined
I replaced video_ids with videos (the list of all of Amixem's videos, 580 in total, so that the loop knows where to stop):
for i in range(0,len(videos),40):
I get:
TypeError: sequence item 0: expected str instance, dict found
I then tried converting each item to a string:
for i in range(0,40):
    res = (youtube).videos().list(id=",".join(str(v) for v in videos),part="statistics").execute()
So I get:
googleapiclient.errors.HttpError: <HttpError 400 when requesting https://youtube.googleapis.com/youtube/v3/videos returned "The request specifies an invalid filter parameter."
I am a bit discouraged. Could you help me?
My code is below:
#Importing required packages
from googleapiclient.discovery import build
import pandas as pd
#Creating Objects
youTubeApiKey="XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
youtube= build("youtube","v3",developerKey=youTubeApiKey)
channelId = "UCgvqvBoSHB1ctlyyhoHrGwQ"
#Calling Data from API
statchanneldata=youtube.channels().list(part="statistics",id=channelId).execute()
statchannel=statchanneldata["items"][0]["statistics"]
#Getting Snippet Data
snippetdata=youtube.channels().list(part="snippet",id=channelId).execute()
#Getting Details of all videos
contentdata=youtube.channels().list(id=channelId,part="contentDetails").execute()
playlist_id = contentdata["items"][0]["contentDetails"]["relatedPlaylists"]["uploads"]
videos = [ ]
next_page_token = None
while 1:
    res = youtube.playlistItems().list(playlistId=playlist_id,part="snippet",maxResults=10,pageToken=next_page_token).execute()
    videos += res["items"]
    next_page_token = res.get("nextPageToken")
    if next_page_token is None:
        break
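#Getting the statistics of each video
stats = []
for i in range(0,len(video_ids),40):  # video_ids is never defined, hence the NameError
    res = (youtube).videos().list(id=",".join(video_ids[i:i+40]),part="statistics").execute()
    stats += res["items"]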
#Collecting All information in a List & creating a dataframe
title=[ ]
liked=[ ]
disliked=[ ]
views=[ ]
url=[ ]
comment=[ ]
for i in range(len(videos)):
    title.append((videos[i])["snippet"]["title"])
    url.append("https://www.youtube.com/watch?v="+(videos[i])["snippet"]["resourceId"]["videoId"])
    liked.append(int((stats[i])["statistics"]["likeCount"]))
    disliked.append(int((stats[i])["statistics"]["dislikeCount"]))
    views.append(int((stats[i])["statistics"]["viewCount"]))
    comment.append(int((stats[i])["statistics"]["commentCount"]))
data={"title":title,"url":url,"liked":liked,"disliked":disliked,"views":views,"comment":comment}
df=pd.DataFrame(data)
df
Thanks, Aurélien.
Solution
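Both errors in the question seem to share one cause: videos holds whole playlist items (dicts), while the id parameter of videos().list expects a comma-separated string of video IDs. The sketch below reproduces this offline with made-up data. The name sample_videos and the two IDs are hypothetical; only the nesting mirrors the items used in the question's code.

```python
# Hypothetical sample shaped like the playlist items stored in `videos`.
sample_videos = [
    {"snippet": {"title": "First video", "resourceId": {"videoId": "aaaaaaaaaaa"}}},
    {"snippet": {"title": "Second video", "resourceId": {"videoId": "bbbbbbbbbbb"}}},
]

# 1) Joining the dicts directly fails, because str.join only accepts strings.
try:
    ",".join(sample_videos)
    join_error = None
except TypeError as exc:
    join_error = str(exc)
print(join_error)

# 2) Wrapping each item in str() avoids the TypeError, but the result is the
#    text of the dicts rather than a list of IDs.
bad_filter = ",".join(str(v) for v in sample_videos)
print(bad_filter[:40])

# 3) Extracting the videoId strings first gives a comma-separated ID list.
good_filter = ",".join(v["snippet"]["resourceId"]["videoId"] for v in sample_videos)
print(good_filter)
```

The string built in step 2 is presumably what the API rejected with the 400 "invalid filter parameter" response, since it contains no usable video IDs.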
Replace the code:
for i in range(0,len(videos),40):
    res = (youtube).videos().list(id=",".join(str(v) for v in videos),part="statistics").execute()
with:
for i in range(0,len(videos)):
    res = (youtube).videos().list(id=videos[i]['snippet']['resourceId']['videoId'],part="statistics").execute()
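Note that this version sends one request per video. As a rough illustration of the difference (pure arithmetic, no API call), the sketch below compares the number of requests with and without batching. The chunked helper and the placeholder IDs are hypothetical; 580 is the video count mentioned in the question and 40 is the batch size used there.

```python
def chunked(items, size):
    """Split a list into consecutive slices of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]

# Placeholder IDs standing in for the 580 videos mentioned in the question.
fake_ids = [f"video{n}" for n in range(580)]

requests_one_by_one = len(fake_ids)            # one videos().list call per video
batches = chunked(fake_ids, 40)                # 40 IDs per call, as in the question
requests_batched = len(batches)

print(requests_one_by_one, requests_batched)   # 580 15
print(len(batches[-1]))                        # 20 IDs left for the last call
```

Each batch would then be passed to the id parameter as ",".join(batch).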
OK, I found the solution:
You just need to define video_ids:
video_ids = list(map(lambda x:x['snippet']['resourceId']['videoId'],videos))
So the code becomes:
#Getting the statistics of each video
stats = []
video_ids = list(map(lambda x:x['snippet']['resourceId']['videoId'],videos))
for i in range(0,len(video_ids),40):
    res = (youtube).videos().list(id=",".join(video_ids[i:i+40]),part="statistics").execute()
    stats += res["items"]
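One caveat when combining videos and stats afterwards: the question's last loop pairs them by index and reads every counter with ["..."], which raises a KeyError whenever a counter is absent from a video's statistics (for example hidden likes or disabled comments; as far as I know, dislikeCount is no longer returned for ordinary API-key requests). Below is a minimal sketch of a more defensive merge keyed by video ID. The names fetch_stats, to_int and build_rows are my own, youtube is assumed to be the client created with build(...), and using None for missing counters is just one possible choice.

```python
def fetch_stats(youtube, video_ids, batch_size=40):
    """Call videos().list once per batch of IDs and collect the returned items."""
    stats = []
    for i in range(0, len(video_ids), batch_size):
        res = youtube.videos().list(
            id=",".join(video_ids[i:i + batch_size]),
            part="statistics",
        ).execute()
        stats += res["items"]
    return stats


def to_int(value):
    """Convert an API counter (a string) to int, keeping None for missing values."""
    return int(value) if value is not None else None


def build_rows(videos, stats):
    """Merge playlist items and statistics by video ID instead of by position."""
    stats_by_id = {item["id"]: item.get("statistics", {}) for item in stats}
    rows = []
    for video in videos:
        video_id = video["snippet"]["resourceId"]["videoId"]
        s = stats_by_id.get(video_id, {})
        rows.append({
            "title": video["snippet"]["title"],
            "url": "https://www.youtube.com/watch?v=" + video_id,
            "liked": to_int(s.get("likeCount")),
            "disliked": to_int(s.get("dislikeCount")),
            "views": to_int(s.get("viewCount")),
            "comment": to_int(s.get("commentCount")),
        })
    return rows
```

With the real client, this would be used as rows = build_rows(videos, fetch_stats(youtube, video_ids)) followed by df = pd.DataFrame(rows).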