Python: Putting YouTube analytics data from the API into a DataFrame

Problem

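Dear fellow developers,

I hope you are all doing well.

I am just starting out with Python (I have taken a few courses), and to improve my skills I decided to try analyzing a YouTube channel (Amixem, a French YouTuber).

So I tried to follow this tutorial to load YouTube data (from the API) into a DataFrame, in order to run some statistics on it afterwards.

I don't understand which variable should be used on line 35 (code provided below) in place of video_ids: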

for i in range(0,len(video_ids),40):

I get

NameError: name 'video_ids' is not defined

I replaced "video_ids" with "videos" (the full set of Amixem's videos, 580 of them, to bound the loop):

for i in range(0,len(videos),40):

Then I get

TypeError: sequence item 0: expected str instance, dict found

So I tried to work around it like this:

for i in range(0,40):
    res = (youtube).videos().list(id=",".join(str(v) for v in videos),part="statistics").execute()

And now I get

googleapiclient.errors.HttpError: <HttpError 400 when requesting https://youtube.googleapis.com/youtube/v3/videos returned "The request specifies an invalid filter parameter."

I'm a bit discouraged. Can you help me?

My code is below:

#Importing required packages
from googleapiclient.discovery import build
import pandas as pd

#Creating Objects
youTubeApiKey="XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
youtube= build("youtube","v3",developerKey=youTubeApiKey)
channelId = "UCgvqvBoSHB1ctlyyhoHrGwQ"

#Calling Data from API
statchanneldata=youtube.channels().list(part="statistics",id=channelId).execute()
statchannel=statchanneldata["items"][0]["statistics"]

#Getting Snippet Data
snippetdata=youtube.channels().list(part="snippet",id=channelId).execute()

#Getting Details of all videos
contentdata=youtube.channels().list(id=channelId,part="contentDetails").execute()
playlist_id = contentdata["items"][0]["contentDetails"]["relatedPlaylists"]["uploads"]
videos = [ ]
next_page_token = None
while 1:
    res = youtube.playlistItems().list(playlistId=playlist_id,part="snippet",maxResults=10,pageToken=next_page_token).execute()

    videos += res["items"]
    next_page_token = res.get("nextPageToken")
    if next_page_token is None:
        break

#Getting the statistics of each video
stats = []
for i in range(0,len(video_ids),40):
    res = (youtube).videos().list(id=",".join(video_ids[i:i+40]),part="statistics").execute()
    stats += res["items"]

#Collecting All information in a List & creating a dataframe
title=[ ]
liked=[ ]
disliked=[ ]
views=[ ]
url=[ ]
comment=[ ]

for i in range(len(videos)):
    title.append((videos[i])["snippet"]["title"])
    url.append("https://www.youtube.com/watch?v="+(videos[i])["snippet"]["resourceId"]["videoId"])
    liked.append(int((stats[i])["statistics"]["likeCount"]))
    disliked.append(int((stats[i])["statistics"]["dislikeCount"]))
    views.append(int((stats[i])["statistics"]["viewCount"]))
    comment.append(int((stats[i])["statistics"]["commentCount"]))
data={"title":title,"url":url,"liked":liked,"disliked":disliked,"views":views,"comment":comment}
df=pd.DataFrame(data)

df

Thanks, Aurélien.

Solution

Replace this code:

for i in range(0,len(videos),40):
    res = (youtube).videos().list(id=",".join(str(v) for v in videos),part="statistics").execute() 

with:

for i in range(0,len(videos)):
    res = (youtube).videos().list(id=videos[i]['snippet']['resourceId']['videoId'],part="statistics").execute()
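
For a sense of what this change costs: asking for one ID per call means one request per video, whereas the tutorial's loop steps through the list 40 at a time. A quick back-of-the-envelope sketch, using the 580 videos quoted in the question (the constant names are mine, purely for illustration):

```python
import math

TOTAL_VIDEOS = 580  # figure quoted in the question
BATCH_SIZE = 40     # step used by the tutorial's range(0, len(...), 40)

# One videos().list() call per video ID.
requests_one_by_one = TOTAL_VIDEOS

# One call per slice of up to 40 IDs; the last slice may be shorter.
requests_batched = math.ceil(TOTAL_VIDEOS / BATCH_SIZE)

print(requests_one_by_one, requests_batched)
```

Either way, the line `stats += res["items"]` has to stay inside the loop, otherwise only the last response is kept.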

OK, I found the solution:

You just need to define video_ids:

video_ids = list(map(lambda x:x['snippet']['resourceId']['videoId'],videos))
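
To see why this fixes the TypeError: `videos` holds the full playlist items (dicts), while `",".join(...)` only accepts strings. The sketch below uses two made-up items shaped like the ones in the question's code (the titles and IDs are invented), so it runs without an API key:

```python
# Invented sample data, shaped like res["items"] from playlistItems().list().
videos = [
    {"snippet": {"title": "First", "resourceId": {"videoId": "aaa111"}}},
    {"snippet": {"title": "Second", "resourceId": {"videoId": "bbb222"}}},
]

# Joining the dicts directly reproduces the error from the question.
try:
    ",".join(videos)
    join_error = None
except TypeError as exc:
    join_error = str(exc)

# Pull out the ID strings first, then join them.
video_ids = [v["snippet"]["resourceId"]["videoId"] for v in videos]
id_param = ",".join(video_ids)

print(join_error)
print(id_param)
```

Wrapping each dict in `str(v)` silences the TypeError but sends the text of the whole dict as the `id`, which is presumably why the API answered with the 400 error.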

So we end up with:

#Getting the statistics of each video
stats = []
video_ids = list(map(lambda x:x['snippet']['resourceId']['videoId'],videos))
for i in range(0,len(video_ids),40):
    res = (youtube).videos().list(id=",".join(video_ids[i:i+40]),part="statistics").execute()
    stats += res["items"]
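
Once `stats` is filled, the last part of the question's code pairs `videos[i]` with `stats[i]` by position. That only works if every video comes back with every counter. Below is a more defensive sketch, assuming an item or a counter can be missing from the response; `build_rows` and the sample data are hypothetical, not part of the tutorial:

```python
def build_rows(videos, stats):
    """Pair playlist items with their statistics by video ID, not by position."""
    # Each item returned by videos().list() carries its video ID under "id".
    stats_by_id = {item["id"]: item.get("statistics", {}) for item in stats}

    def to_int(value):
        # Counters arrive as strings; a missing counter stays None.
        return int(value) if value is not None else None

    rows = []
    for video in videos:
        snippet = video["snippet"]
        video_id = snippet["resourceId"]["videoId"]
        counts = stats_by_id.get(video_id, {})
        rows.append({
            "title": snippet["title"],
            "url": "https://www.youtube.com/watch?v=" + video_id,
            "liked": to_int(counts.get("likeCount")),
            "disliked": to_int(counts.get("dislikeCount")),
            "views": to_int(counts.get("viewCount")),
            "comment": to_int(counts.get("commentCount")),
        })
    return rows


# Invented sample data shaped like the two API responses.
videos = [
    {"snippet": {"title": "First", "resourceId": {"videoId": "aaa111"}}},
    {"snippet": {"title": "Second", "resourceId": {"videoId": "bbb222"}}},
]
stats = [
    {"id": "aaa111",
     "statistics": {"viewCount": "1000", "likeCount": "50", "commentCount": "7"}},
]

rows = build_rows(videos, stats)
print(rows[0]["views"], rows[0]["disliked"], rows[1]["views"])
```

Passing the result to `pd.DataFrame(rows)` then gives the same columns as the original code, with missing values wherever a counter was absent.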