Problem description
I have a file (urls.csv) containing 1 million URLs. Each line is a new URL, like:
I want to fetch the JSON document behind each URL and save it as a separate JSON file per URL, with the filenames numbered sequentially 1, 2, 3, ... n.
Here is what I have so far:
import requests
import csv

url = []
with open('urls.csv') as csvfile:
    csvReader = csv.reader(csvfile)
    for row in csvReader:
        url.append(row[0])

headers = {'Accept': 'application/json'}
response = requests.get(url, headers=headers)
with open('outputfile.json', 'wb') as outf:
    outf.write(response.content)
How should I fix this?
Solution
Try this:
import requests
import csv

urls = []
with open('urls.csv') as csvfile:
    csvReader = csv.reader(csvfile)
    for row in csvReader:
        urls.append(row[0])

headers = {'Accept': 'application/json'}
for url in urls:
    response = requests.get(url, headers=headers)
    filename = url.split('/')[-1]
    with open(f'{filename}.json', 'wb') as outf:
        outf.write(response.content)
Assuming your 3rd URL is https://example.com/3, the code will save a file named 3.json for the corresponding response.
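Note that this names each file after the last path segment of its URL, whereas the question asked for files numbered 1, 2, 3, ... n in input order. If the URLs do not end in clean sequential numbers, a small variation using enumerate gives the requested numbering instead. A minimal sketch (the two URLs here are stand-ins for the list read from urls.csv):

```python
import requests

# Stand-in list; in the real script this comes from reading urls.csv
urls = ['https://example.com/a', 'https://example.com/b']
headers = {'Accept': 'application/json'}

def download_numbered(urls, headers=None):
    # enumerate pairs each URL with its 1-based position in the input,
    # so output files are numbered 1.json, 2.json, ... regardless of
    # what the URL text itself looks like
    for i, url in enumerate(urls, start=1):
        response = requests.get(url, headers=headers)
        with open(f'{i}.json', 'wb') as outf:
            outf.write(response.content)

# The numbering itself needs no network: it depends only on position
names = [f'{i}.json' for i, _ in enumerate(urls, start=1)]
print(names)
```

This also avoids a pitfall of the tail-of-URL scheme: two different URLs that happen to end in the same segment would silently overwrite each other's output file.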