HTTP 406 Not Acceptable client error when using urlopen

Problem description

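I am using urllib.request.urlopen to query the URL http://dblp.org/db/conf/lak/index. For some reason I cannot access this site with the Python module urllib, because I receive the following HTTP status code error:

HTTPError: HTTP Error 406: Not Acceptable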

This is the code I use to make the request:

    from urllib.request import urlopen
    from bs4 import BeautifulSoup

    url = 'http://dblp.org/db'
    # Fetch the page; this is the call that fails with HTTP 406
    html = urlopen(url).read()
    # Name the parser explicitly so BeautifulSoup does not have to guess one
    soup = BeautifulSoup(html, 'html.parser')
    print(soup.prettify())

I am not sure what causes this error, so I would appreciate help resolving it.

Solution

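I am looking into the 406 error code, which occurs when the server cannot produce a response that satisfies the Accept headers specified in the request. If I can get urlopen to work, I will post that answer as well.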
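To make the header discussion concrete, here is a small sketch that shows which headers urllib's default opener attaches on its own and how an explicit Accept header can be put on a Request. It makes no network call. The Accept value is an illustrative assumption, and which header the server actually objects to is not confirmed here.

```python
from urllib.request import Request, build_opener

# Headers that urllib's default opener adds to every request it sends.
# Only a User-agent entry of the form 'Python-urllib/3.x' is present.
opener = build_opener()
default_headers = dict(opener.addheaders)
print(default_headers)

# A Request carrying an explicit Accept header.
# The header value is an assumption chosen for illustration.
request = Request(
    'http://dblp.org/db/conf/lak/index',
    headers={'Accept': 'text/html,application/xhtml+xml;q=0.9,*/*;q=0.8'},
)
print(request.header_items())
```

Comparing the two printouts shows what a plain urlopen(url) call identifies itself as and what changes once headers are set by hand.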

The error does not occur when using the Python requests library. The code below uses requests rather than urlopen, and it does not produce the 406 error.

    import requests
    from bs4 import BeautifulSoup

    # Browser-like User-Agent string sent along with the request
    user_agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:78.0) Gecko/20100101 Firefox/78.0'
    raw_html = requests.get('http://dblp.org/db/conf/lak/index', headers={'User-Agent': user_agent})
    soup = BeautifulSoup(raw_html.content, 'html.parser')
    print(soup.prettify())
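For comparison with the requests version, a urlopen-based variant might look like the sketch below. It assumes the server rejects urllib's default headers and accepts browser-like ones, which is not verified here. The names BROWSER_HEADERS, build_request and fetch_html are hypothetical helpers introduced for illustration, and the Accept value is an assumption.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen

# Browser-like headers; the Accept value is an illustrative assumption.
BROWSER_HEADERS = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:78.0) Gecko/20100101 Firefox/78.0',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
}


def build_request(url):
    """Return a Request for url that carries the browser-like headers."""
    return Request(url, headers=BROWSER_HEADERS)


def fetch_html(url):
    """Fetch url with urlopen and return the raw response body as bytes."""
    try:
        with urlopen(build_request(url)) as response:
            return response.read()
    except HTTPError as err:
        # Report the status code and reason (for example 406 Not Acceptable), then re-raise.
        print(f'HTTP {err.code}: {err.reason}')
        raise
```

The bytes returned by fetch_html('http://dblp.org/db/conf/lak/index') can be passed to BeautifulSoup(html, 'html.parser') in the same way as raw_html.content in the requests version.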