Problem description
Take this Python 2.7 script as an example; it uses the multiprocessing module:
# Local test
import urllib2
import shlex
import requests
import json
import threading
import os
import logging
from multiprocessing import Process, Queue
from threading import current_thread

sessions = {}
logging.basicConfig(filename='/tmp/python.log', level=logging.DEBUG)

def worker(session, queue):
    logging.exception('parent process: ' + str(os.getppid()) + ',process id: ' + str(os.getpid()) + ' -- ' + str(session.verify))
    url = 'http://127.0.0.1:8487/test'
    response = session.get(url, verify=True, timeout=5).json()
    queue.put(response)
    return response

def doWork():
    global sessions
    try:
        thread = threading.current_thread()
        if not id(thread) in sessions:
            sessions[id(thread)] = requests.Session()
            session = sessions[id(thread)]
            session.verify = 'new session - ' + current_thread().name
        else:
            session = sessions[id(thread)]
            session.verify = 'reuse session - ' + current_thread().name
        queue = Queue()
        p = Process(target=worker, args=(session, queue))
        p.start()
        p.join()
        return queue.get()
    except Exception as e:
        logging.exception(e)
        return "error"
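Please don't worry about the "session registry". It is needed for the larger environment, but it has no bearing on the problem here. What I want to show is that I am actually reusing the same session object in the forked process. So I am running the script like this: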
python
Python 2.7.5 (default,Aug 7 2019,00:51:29)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] on linux2
Type "help","copyright","credits" or "license" for more information.
>>> import test
>>> test.doWork()
{u'name': 3}
>>> test.doWork()
{u'name': 3}
>>> test.doWork()
{u'name': 3}
>>>
My python.log shows the following:
ERROR:root:parent process: 10092,process id: 10240 -- new session - MainThread
None
INFO:urllib3.connectionpool:Starting new HTTP connection (1): 127.0.0.1
DEBUG:urllib3.connectionpool:"GET /test HTTP/1.1" 200 10
ERROR:root:parent process: 10092,process id: 10253 -- reuse session - MainThread
None
INFO:urllib3.connectionpool:Starting new HTTP connection (1): 127.0.0.1
DEBUG:urllib3.connectionpool:"GET /test HTTP/1.1" 200 10
ERROR:root:parent process: 10092,process id: 10261 -- reuse session - MainThread
None
INFO:urllib3.connectionpool:Starting new HTTP connection (1): 127.0.0.1
DEBUG:urllib3.connectionpool:"GET /test HTTP/1.1" 200 10
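If the session is the same session object every time, why does it start a new HTTP connection on each call? If I change the code to call worker directly, without multiprocessing, it works as expected and the connection is reused.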
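For comparison, the direct-call variant mentioned above might look like the sketch below. The function name do_work_direct is hypothetical, and session is assumed to be a requests.Session (or any object with a compatible get() method) created once and passed in by the caller.

```python
import logging
import os

URL = 'http://127.0.0.1:8487/test'  # same mock endpoint as in the script above


def do_work_direct(session, url=URL):
    """Issue the request in the calling process instead of a child process.

    The session object is used as-is, so whatever connection pool it holds
    stays alive between calls.
    """
    logging.debug('process id: %s', os.getpid())
    response = session.get(url, timeout=5)
    return response.json()


# Usage (needs the requests package and the mock server to be running):
#   import requests
#   session = requests.Session()
#   do_work_direct(session)
#   do_work_direct(session)  # same process, same session, same pool
```

Because no second process is involved, both calls operate on the one session object that lives in the interpreter.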
FYI, I am using a mock HTTP server (mock-server.com), which reports the following:
2020-09-25 08:34:55 5.11.1 INFO 1080 returning response:
{
"body" : "{\"name\":3}","delay" : {
"timeUnit" : "MILLISECONDS","value" : 30
},"connectionoptions" : {
"closeSocket" : false
}
}
for request:
{
"method" : "GET","path" : "/test","headers" : {
"Host" : [ "127.0.0.1:8487" ],"Connection" : [ "keep-alive" ],"Accept-Encoding" : [ "gzip,deflate" ],"Accept" : [ "*/*" ],"User-Agent" : [ "python-requests/2.6.0 cpython/2.7.5 Linux/3.10.0-1062.18.1.el7.x86_64" ],"content-length" : [ "0" ]
},"keepAlive" : true,"secure" : false
}
for action:
{
"body" : "{\"name\":3}","connectionoptions" : {
"closeSocket" : false
}
}
The server is replying with keep-alive:
curl -v 127.0.0.1:8487/test
* About to connect() to 127.0.0.1 port 8487 (#0)
* Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 8487 (#0)
> GET /test HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 127.0.0.1:8487
> Accept: */*
>
< HTTP/1.1 200 OK
< connection: keep-alive
< content-length: 10
<
* Connection #0 to host 127.0.0.1 left intact
{"name":3}
Solution
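requests.Session() has a built-in connection pool (via HTTPAdapter, sized at 10 by default), and you are probably tripping over how that internal pool gets initialized. First, in your use case you may not need the extra wrapper at all, because the session already has its own pool. Alternatively, restrict the internal pool to a single connection to see whether that helps: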
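import requests

session = requests.Session()
# pool_connections: number of per-host pools cached; pool_maxsize: connections kept per pool
adapter = requests.adapters.HTTPAdapter(pool_connections=1, pool_maxsize=1)
session.mount("http://", adapter)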
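One thing worth checking alongside the pool settings: multiprocessing hands the worker its own copy of the session, so a connection opened in the child exists only in the child and is gone once that process exits. The sketch below illustrates the effect without any network access. It assumes Python 3 on a Unix-like system where the fork start method is available; FakeSession, run_in_child and the connection counter are hypothetical stand-ins and not part of requests.

```python
import multiprocessing
import os

# Assumption: the 'fork' start method is available (Linux, macOS).
ctx = multiprocessing.get_context('fork')


class FakeSession(object):
    """Hypothetical stand-in for requests.Session that counts open connections."""

    def __init__(self):
        self.open_connections = 0

    def get(self):
        # Open a "connection" on first use, reuse it afterwards.
        if self.open_connections == 0:
            self.open_connections = 1
            return 'new connection'
        return 'reused connection'


def worker(session, queue):
    # Runs in the child process and only touches the child's copy of session.
    queue.put((os.getpid(), session.get()))


def run_in_child(session):
    queue = ctx.Queue()
    p = ctx.Process(target=worker, args=(session, queue))
    p.start()
    result = queue.get()  # read before join() so the child can flush the queue
    p.join()
    return result


def demo():
    session = FakeSession()
    for _ in range(3):
        pid, outcome = run_in_child(session)
        print(pid, outcome, 'parent sees', session.open_connections)
    # In-process calls mutate the one real object, so the second call reuses it.
    print(session.get(), '/', session.get())


if __name__ == '__main__':
    demo()
```

If this is what is happening in the script above, keeping the request in the parent process, or keeping one long-lived worker process that owns the session, would give the pool a chance to reuse its connection.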