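Python requests: connection refused

Problem description

I am trying to use the Python requests module to get the status_code for various URLs listed in a CSV file. It works for some websites, but for most of them it reports "Connection Refused", even though those sites load fine when I visit them in a browser.

The code is as follows: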

import pandas as pd 
import requests 
from requests.adapters import HTTPAdapter
from fake_useragent import UserAgent
import time
import urllib3

urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

df = pd.read_csv('Websites.csv')
output_data = pd.DataFrame(columns=['url','status'])
number_urls = df.shape[0]

i = 0

for url in df['urls']:

    session = requests.Session()
    adapter = HTTPAdapter(max_retries=3)
    adapter.max_retries.respect_retry_after_header = False
    session.mount('http://',adapter)
    session.mount('https://',adapter)

    print(url)

    ua = UserAgent()
    header = {'User-Agent':str(ua.chrome)}
    
    try:
        # Status
        start = time.time()
        response = session.get(url,headers=header,verify=False,timeout=0.5)
        request_time = (time.time() - start) * 1000  # seconds -> milliseconds
        info = "Request completed in {0:.0f}ms".format(request_time)
        print(info)
        status = response.status_code
        if (status == 200):
            status = "Connection Successful"
        if (status == 404):
            status = "404 Error"
        if (status == 403):
            status = "403 Error"
        if (status == 503):
            status = "503 Error"
        print(status)

        output_data.loc[i] = [df.iloc[i,0],status]

        i += 1

    except requests.exceptions.Timeout:
        status = "Connection Timed Out"
        print(status)
        request_time = (time.time() - start) * 1000  # seconds -> milliseconds
        info = "TimeOut in {0:.0f}ms".format(request_time)
        print(info)

        output_data.loc[i] = [df.iloc[i,0],status]
        i += 1

    except requests.exceptions.ConnectionError:
        status = "Connection Refused"
        print(status)
        request_time = (time.time() - start) * 1000  # seconds -> milliseconds
        info = "Connection Error in {0:.0f}ms".format(request_time)
        print(info)

        output_data.loc[i] = [df.iloc[i,0],status]
        i += 1

output_data.to_csv('dead_blocked2.csv',index=False)
print('CSV file created!')

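Here is an example of a website that reports "Connection Refused" even though the site is up: https://www.dytt8.net

I tried using a different TLS version and mounting the adapter on my session with the following snippet, but it still does not work: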

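import ssl
from urllib3.poolmanager import PoolManager
class MyAdapter(HTTPAdapter):
    def init_poolmanager(self,connections,maxsize,block=False):
        # Pin the TLS version used by this adapter's connection pool
        self.poolmanager = PoolManager(num_pools=connections,maxsize=maxsize,block=block,ssl_version=ssl.PROTOCOL_TLSv1)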

Can anyone help?

Thanks!

Solution

No confirmed solution to this problem has been found yet.
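
As a diagnostic starting point rather than a confirmed fix: the script in the question labels every requests.exceptions.ConnectionError as "Connection Refused", but that exception type also covers DNS failures, TLS handshake errors and connection resets, so the label may be misleading. The sketch below uses only the standard library socket module to check what a plain TCP connection attempt actually does. The function name probe_tcp, the default port 443 and the 5-second timeout are assumptions made for illustration and do not come from the original code.

```python
import socket


def probe_tcp(host, port=443, timeout=5.0):
    """Classify a raw TCP connection attempt to host:port.

    Hypothetical helper: the name, default port and timeout are
    illustrative assumptions, not part of the question's script.
    """
    try:
        # create_connection resolves the host name and opens a TCP socket
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The remote end (or something on the path) actively rejected the connection
        return "refused"
    except socket.timeout:
        # No answer within the timeout
        return "timeout"
    except socket.gaierror:
        # The host name could not be resolved
        return "dns-failure"
    except OSError as exc:
        # Any other network-level failure, e.g. an unreachable network
        return "error: {}".format(exc)
```

If probe_tcp("www.dytt8.net") returns "open", the server is accepting TCP connections and the failure probably happens later, for example during the TLS handshake. A result of "refused" would instead suggest something on the network path between the script and the server is rejecting the connection.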
