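Problem description
I am trying to extract information from the JSON file that youtube-dl writes, and grep some of it into a .txt file.
Example of youtube-dl output when downloading a video: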
[info] Writing video description to: /Users/ACCOUNT/Downloads/Rick Astley - Never Gonna Give You Up (Video).description
[info] Writing video description metadata as JSON to: /Users/ACCOUNT/Downloads/Rick Astley - Never Gonna Give You Up (Video).info.json
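My idea
- Grep the .json and .description file paths from the output, to use in later grep commands.
- Run a working version of the script below and add the new text above the description text in the .description file.
- (Rename .description to .txt.)
I prefer this approach because youtube-dl only needs to run once.
If there are other commands common to both Mac and Linux that are as simple as grep, I am fine with using them instead of grep.
Questions
- How do I grep the file paths and use them in other commands, as in the script example below?
- How do I run the script below, but add all the information above the current description text in that text file?
- When it gets information from the JSON file, it also gets the "" before and after. So the video name becomes "VIDEO NAME", but I only want VIDEO NAME.
- How do I copy the tags from the JSON file? In the .json file the tags look like this: "tags": ["music","video","classic"]. I want to get "music","classic".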
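The last two points can be sketched with grep and sed alone. This is only an illustration under stated assumptions: the temporary file and its one-line JSON content are hypothetical stand-ins for a real .info.json file, and the patterns assume the values contain no escaped quotes and the tags contain no ] character. A real JSON parser is more robust.

```shell
# Hypothetical sample standing in for a real .info.json file.
jsonfile=$(mktemp)
printf '%s\n' '{"title": "VIDEO NAME", "tags": ["music","video","classic"]}' > "$jsonfile"

# Match the key/value pair, then strip the key and the surrounding quotes.
title=$(grep -o '"title": *"[^"]*"' "$jsonfile" | sed 's/^"title": *"//; s/"$//')

# Match the whole array, then strip the key and the square brackets.
tags=$(grep -o '"tags": *\[[^]]*]' "$jsonfile" | sed 's/^"tags": *\[//; s/]$//')

printf '%s\n' "$title" "$tags"
rm -f "$jsonfile"
```

The title is printed without its quotes, while the tags keep theirs and lose only the enclosing brackets.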
Script example
txtfile="$GREP_DESCRIPTION_FROM_YOUTUBE-DL_OUTPUT"
jsonfile="$GREP_JSON_FROM_YOUTUBE-DL_OUTPUT"
echo TITLE >> $txtfile
grep -o '"title": *"[^"]*"' $jsonfile | grep -o '"[^"]*"$' >> $txtfile
echo \ >> $txtfile
echo CHANNEL >> $txtfile
grep -o '"uploader": *"[^"]*"' $jsonfile | grep -o '"[^"]*"$' >> $txtfile
echo \ >> $txtfile
echo CHANNEL URL >> $txtfile
grep -o '"uploader_url": *"[^"]*"' $jsonfile | grep -o '"[^"]*"$' >> $txtfile
echo \ >> $txtfile
echo UPLOAD DATE >> $txtfile
grep -o '"upload_date": *"[^"]*"' $jsonfile | grep -o '"[^"]*"$' >> $txtfile
echo \ >> $txtfile
echo TAGS >> $txtfile
grep -o '"tags": *"[^"]*"' $jsonfile | grep -o '"[^"]*"$' >> $txtfile
echo \ >> $txtfile
echo URL >> $txtfile
echo $url >> $txtfile
echo \ >> $txtfile
echo DESCRIPTION >> $txtfile
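Because every line in the script uses >>, the new text ends up below whatever the file already contains. One common way to put it above the existing description is to write to a temporary file first and then replace the original. A minimal sketch, assuming a hypothetical temporary file in place of the real .description file and a hard-coded title in place of the extracted one:

```shell
# Hypothetical stand-in for the .description file written by youtube-dl.
txtfile=$(mktemp)
printf '%s\n' 'Original description text.' > "$txtfile"

# Build the new file in a temporary location: metadata first, then the
# existing description. Replace the original only if that succeeded.
tmpfile=$(mktemp)
{
  printf '%s\n' 'TITLE' 'VIDEO NAME' '' 'DESCRIPTION'
  cat "$txtfile"
} > "$tmpfile" && mv "$tmpfile" "$txtfile"

cat "$txtfile"
```

Grouping the commands in braces means the output file is opened once, so the individual commands need no redirection of their own.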
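Solution
Thanks, Barmar! That answered three of my four questions.
What is left, and what I cannot figure out, is how to get the location of the JSON file from the youtube-dl output, how to make that work in the script, and how to get a .txt file with the same name in the same directory.
Something like this:
- Grep everything after
[info] Writing video description metadata as JSON to:
that is, /Users/ACCOUNT/Downloads/Rick Astley - Never Gonna Give You Up (Video).info.json
- Set that as $jsonfile
- Take the same output from point 1, replace the extension (everything after the last dot) with .txt, and set that as $txtfile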
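These three steps could look roughly like the sketch below. It assumes the youtube-dl output has already been captured in a variable (here named log and filled with the two sample lines from the question), and it uses sed rather than grep because sed can drop the prefix in the same step.

```shell
# Sample output copied from the question. In practice it could be captured
# with something like: log=$(youtube-dl [options] "$url" 2>&1)
log='[info] Writing video description to: /Users/ACCOUNT/Downloads/Rick Astley - Never Gonna Give You Up (Video).description
[info] Writing video description metadata as JSON to: /Users/ACCOUNT/Downloads/Rick Astley - Never Gonna Give You Up (Video).info.json'

# Print only the line that mentions the JSON file, minus everything up to "to: ".
jsonfile=$(printf '%s\n' "$log" | sed -n 's/^.*as JSON to: //p')

# Replace everything after the last dot with "txt".
txtfile="${jsonfile%.*}.txt"

printf '%s\n' "$jsonfile" "$txtfile"
```

Stripping only what follows the last dot leaves a name ending in .info.txt. If the .info part is unwanted, "${jsonfile%.info.json}.txt" would remove it as well.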
Updated script using jq
#!/bin/bash
txtfile="textfile.txt"
jsonfile="jsonfile.json"
# Each section: a heading, the matching JSON field, then two blank lines.
echo "- TITLE -" >> "$txtfile"
jq -r '.title' "$jsonfile" >> "$txtfile"
printf '\n\n' >> "$txtfile"
echo "- CHANNEL -" >> "$txtfile"
jq -r '.uploader' "$jsonfile" >> "$txtfile"
printf '\n\n' >> "$txtfile"
echo "- CHANNEL URL -" >> "$txtfile"
jq -r '.uploader_url' "$jsonfile" >> "$txtfile"
printf '\n\n' >> "$txtfile"
echo "- UPLOAD DATE -" >> "$txtfile"
jq -r '.upload_date' "$jsonfile" >> "$txtfile"
printf '\n\n' >> "$txtfile"
echo "- URL -" >> "$txtfile"
jq -r '.webpage_url' "$jsonfile" >> "$txtfile"
printf '\n\n' >> "$txtfile"
echo "- TAGS -" >> "$txtfile"
# -c prints the tags array compactly on one line, brackets included.
jq -r -c '.tags' "$jsonfile" >> "$txtfile"
printf '\n\n' >> "$txtfile"
echo "- DESCRIPTION -" >> "$txtfile"
jq -r '.description' "$jsonfile" >> "$txtfile"
youtube-dl --help | grep "dump-json"
-j, --dump-json    Simulate, quiet but print JSON information.
With this option there is no need to download the video at all. Just pipe the output of youtube-dl to a suitable JSON parser. I would recommend xidel.
youtube-dl -j https://www.youtube.com/watch?v=dQw4w9WgXcQ | xidel - -se '
  $json/(
    "- TITLE -",title,"",
    "- CHANNEL -",uploader,
    "- CHANNEL URL -",uploader_url,
    "- UPLOAD DATE -",upload_date,
    "- URL -",webpage_url,
    "- TAGS -",substring-before(substring(serialize-json(tags),2),"]"),
    "- DESCRIPTION -",description
  )
'
If you have already downloaded the video and the JSON (I assume with --write-info-json), you can use --get-filename to retrieve the file name:
youtube-dl --get-filename https://www.youtube.com/watch?v=dQw4w9WgXcQ
Rick Astley - Never Gonna Give You Up (Video)-dQw4w9WgXcQ.mp4
jsonfile=$(youtube-dl --get-filename https://www.youtube.com/watch?v=dQw4w9WgXcQ)
xidel -s "${jsonfile/.mp4/.info}.json" -e '
$json/(
[...]
)
' > "${jsonfile/.mp4/.info}.txt"
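The ${jsonfile/.mp4/.info} substitution relies on the video ending in .mp4. A possible variant that does not depend on the extension strips whatever follows the last dot instead. A small sketch, using the file name printed by --get-filename above; the variable names videofile, base, infofile and txtfile are chosen here for illustration only:

```shell
# File name as printed by --get-filename in the example above.
videofile='Rick Astley - Never Gonna Give You Up (Video)-dQw4w9WgXcQ.mp4'

# ${var%.*} removes the shortest trailing ".something", whatever it is.
base="${videofile%.*}"
infofile="$base.info.json"
txtfile="$base.info.txt"

printf '%s\n' "$infofile" "$txtfile"
```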
Output of the command, or the contents of "Rick Astley - Never Gonna Give You Up (Video)-dQw4w9WgXcQ.info.txt":
- TITLE -
Rick Astley - Never Gonna Give You Up (Video)
- CHANNEL -
RickAstleyVEVO
- CHANNEL URL -
http://www.youtube.com/user/RickAstleyVEVO
- UPLOAD DATE -
20091024
- URL -
https://www.youtube.com/watch?v=dQw4w9WgXcQ
- TAGS -
"the boys soundtrack","the boys amazon prime","Never gonna give you up the boys","RickAstleyvevo","vevo","official","Rick Roll","video","music video","Rick Astley album","rick astley official","single","album","together forever","Never Gonna Give You Up","Whenever You Need Somebody","pop","rickrolled","WRECK-IT RALPH 2","Fortnite song Fortnite item shop Fortnite time shop today Fortnite montage","Fortnite event","Fortnite dance","fortnite never gonna give you up"
- DESCRIPTION -
Rick Astley's official music video for "Never Gonna Give You Up" Listen to Rick Astley: https://RickAstley.lnk.to/_listenYD Subscribe to the official Rick As...
In fact, if this is all the information you need, you do not need youtube-dl at all. Parsing the HTML source is enough.
xidel -s https://www.youtube.com/watch?v=dQw4w9WgXcQ -e '
  "- TITLE -",//meta[@itemprop="name"]/@content,
  "- CHANNEL -",//span[@itemprop="author"]/link/@content,
  "- CHANNEL URL -",//span[@itemprop="author"]/link/@href,
  "- UPLOAD DATE -",//meta[@itemprop="datePublished"]/@content,
  "- URL -",//meta[@property="og:url"]/@content,
  "- TAGS -",join(
    //meta[@property="og:video:tag"]/outer-html() ! substring-before(
      substring-after(.,"content="),">"
    ),","
  ),
  "- DESCRIPTION -",//meta[@itemprop="description"]/@content
'
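The HTML source also contains a large JSON object with all the information you need. It is a bit harder to extract, but it can be done. Unlike the other two solutions, this "source" does not truncate the video description: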
xidel -s https://www.youtube.com/watch?v=dQw4w9WgXcQ -e '
  let $json:=json(
        //script/extract(.,"ytplayer.config = (.+?\});",1)[.]
      )/args,
      $a:=json($json/player_response)/videoDetails,
      $b:=json($json/player_response)/microformat
  return (
    "- TITLE -",$a/title,
    "- CHANNEL -",$a/author,
    "- CHANNEL URL -",$b//ownerProfileUrl,
    "- UPLOAD DATE -",$b//publishDate,
    "- URL -",$json/loaderUrl,
    "- TAGS -",substring-before(substring(serialize-json($a/keywords),2),"]"),
    "- DESCRIPTION -",$a/shortDescription
  )
'
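The problem discussed below has been solved.
It was fixed by adding two double quotes around the variable at the end of the script: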
...
' --printed-json-format=compact >> "$textfile"
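Thanks, Reino!
Thanks. I have now tried to get it working. It gave me some errors, so I started troubleshooting. Still no luck.
This is a test to see whether it works. folder and url are set earlier in the script and are temporary.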
folder=/Users/ACCOUNT/Downloads/ytdl/
url=https://www.youtube.com/watch?v=dQw4w9WgXcQ
textfile=$(youtube-dl --get-filename -o $folder'%(title)s/%(title)s.txt' $url)
$textfile
The output is:
-bash: /Users/ACCOUNT/Downloads/ytdl/Rick: No such file or directory
If I create that folder, the output is:
-bash: /Users/ACCOUNT/Downloads/ytdl/Rick: is a directory
But if I test the command written out exactly as I intend it:
youtube-dl --get-filename -o /Users/ACCOUNT/Downloads/ytdl/'%(title)s/%(title)s.txt' https://www.youtube.com/watch?v=dQw4w9WgXcQ
The output is:
/Users/ACCOUNT/Downloads/ytdl/Rick Astley - Never Gonna Give You Up (Video)/Rick Astley - Never Gonna Give You Up (Video).txt
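That is what it should look like. What am I doing wrong?
Here is the xidel script, showing how $url and >> $textfile were changed. I use this script because it gives the full description.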
xidel -s "$url" -e '
  let $json:=json(
    [...]
  )
' --printed-json-format=compact >> $textfile
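The errors shown above are consistent with word splitting. When $textfile is expanded without double quotes, the shell breaks the value at each space, which would explain why bash only reports the part of the path up to "Rick". A small sketch of the effect, using a shortened, hypothetical path modelled on the output above:

```shell
# Shortened, hypothetical path with spaces, modelled on the output above.
textfile='/Users/ACCOUNT/Downloads/ytdl/Rick Astley - Never Gonna Give You Up (Video).txt'

# Unquoted expansion: the shell splits the value at every space.
set -- $textfile
unquoted_count=$#
first_word=$1

# Quoted expansion: the value stays a single word.
set -- "$textfile"
quoted_count=$#

printf 'unquoted: %s words, first is %s\n' "$unquoted_count" "$first_word"
printf 'quoted:   %s word\n' "$quoted_count"
```

Writing >> "$textfile" with the quotes keeps the whole path together as one redirect target.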