Can I use code like this to download images from any search engine?

Problem

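I am trying to download images from Bing into a directory, but for some reason the code just runs and gives me nothing, not even an error. I also set a User-Agent HTTP header, but it still doesn't work. What should I do?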

from bs4 import BeautifulSoup
import requests
from PIL import Image
from io import BytesIO

url = 'https://www.bing.com/search'
search = input("Search for: ")
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:80.0) Gecko/20100101 Firefox/80.0'}
params = {"q": search}
r = requests.get(url,headers=headers,params=params)

soup = BeautifulSoup(r.text,"html.parser")
links = soup.findAll("a",{"class": "thumb"})

for item in links:
     img_obj = requests.get(item.attrs["href"])
     print("Getting",item.attrs["href"])
     title = item.attrs["href"].split("/")[-1]
     img = Image.open(BytesIO(img_obj.content))
     img.save("./scraped_images/" + title,img.format)

Solution

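To get the images, add /images to the URL so that the request goes to Bing's image search (https://www.bing.com/images/search) rather than the regular web search. Here is an example of the modified code: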

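import json
import os
from io import BytesIO

import requests
from bs4 import BeautifulSoup
from PIL import Image

search = input("Search for: ")

url = "https://www.bing.com/images/search"

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:80.0) Gecko/20100101 Firefox/80.0"
}
params = {"q": search, "form": "HDRSC2", "first": "1", "scenario": "ImageBasicHover"}
r = requests.get(url, headers=headers, params=params)

soup = BeautifulSoup(r.text, "html.parser")

# Create the output directory up front; img.save() fails if it is missing.
os.makedirs("./scraped_images", exist_ok=True)

# Each result is an <a class="iusc"> tag whose "m" attribute holds JSON metadata;
# the "murl" field of that JSON is the URL of the full-size image.
for data in soup.find_all("a", {"class": "iusc"}):
    json_data = json.loads(data["m"])
    img_link = json_data["murl"]
    # Drop any query string so it does not end up in the file name.
    title = img_link.split("/")[-1].split("?")[0]

    print("Getting: ", img_link)
    print("Title: ", title + "\n")

    try:
        img_object = requests.get(img_link, headers=headers, timeout=10)
        img = Image.open(BytesIO(img_object.content))
        img.save("./scraped_images/" + title)
    except (requests.RequestException, OSError, ValueError) as exc:
        # Skip images that cannot be downloaded, decoded, or saved.
        print("Skipped:", exc)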
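
One fragile spot in the code above is that the file name is taken straight from the last segment of the image URL, which may carry a query string, percent-encoded characters, or be empty. The following is a minimal sketch of a more defensive approach, using only the standard library. The helper name filename_from_url and the fallback name "image" are hypothetical and not part of the original answer.

```python
import posixpath
import re
from urllib.parse import unquote, urlsplit


def filename_from_url(url, default="image"):
    """Derive a filesystem-safe file name from an image URL (hypothetical helper)."""
    # Keep only the path component, so "?w=200" or "#fragment" never reach the name.
    path = unquote(urlsplit(url).path)
    name = posixpath.basename(path.rstrip("/"))
    # Replace anything outside a conservative character set, then trim dots
    # from both ends so names such as ".." cannot be produced.
    name = re.sub(r"[^A-Za-z0-9._-]", "_", name).strip(".")
    # Fall back to a default when the URL has no usable last segment.
    return name or default


print(filename_from_url("https://example.com/pics/cat.jpg?w=200&h=100"))  # cat.jpg
print(filename_from_url("https://example.com/"))                          # image
```

In the download loop, title = filename_from_url(img_link) could replace the split("/")[-1] expression. Two different images can still map to the same name, so adding a counter or a hash as a prefix may be worth considering.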