Problem description
I have a cluster on EKS running several APIs. This is the yaml file used to deploy them:
apiVersion: v1
kind: Service
metadata:
  name: <api-name>
spec:
  type: ClusterIP
  selector:
    app: <api-name>
  ports:
    - protocol: TCP
      port: 80
      targetPort: <container-port>
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: <api-name>
spec:
  replicas: 1
  selector:
    matchLabels:
      app: <api-name>
  template:
    metadata:
      labels:
        app: <api-name>
    spec:
      containers:
        - name: <api-name>
          image: <ecr-image-url>
          ports:
            - containerPort: <container-port>
              name: <api-name>
          env:
            - name: ENVIRONMENT
              value: <environment>
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: <api-name>
  annotations:
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
    - host: <app-name>.<dns>
      http:
        paths:
          - backend:
              serviceName: <api-name>
              servicePort: 80
Routing works fine (through the network load balancer created by nginx-ingress), but when I try to make a request from one pod to another, I get:
[2020-08-14 11:49:42,214] ERROR in app: Exception on /services [GET]
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.8/site-packages/flask_cors/extension.py", line 161, in wrapped_function
    return cors_after_request(app.make_response(f(*args, **kwargs)))
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1948, in full_dispatch_request
    rv = self.preprocess_request()
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 2242, in preprocess_request
    rv = func()
  File "/app/app/views/__init__.py", line 12, in before_rest_callback
    validate_request(token)
  File "/app/app/utils.py", line 29, in validate_request
    response = httpx.post(url, headers=headers, timeout=60)
  File "/usr/local/lib/python3.8/site-packages/httpx/_api.py", line 269, in post
    return request(
  File "/usr/local/lib/python3.8/site-packages/httpx/_api.py", line 86, in request
    return client.request(
  File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 640, in request
    return self.send(
  File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 670, in send
    response = self._send_handling_redirects(
  File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 699, in _send_handling_redirects
    response = self._send_handling_auth(
  File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 736, in _send_handling_auth
    response = self._send_single_request(request, timeout)
  File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 759, in _send_single_request
    (
  File "/usr/local/lib/python3.8/contextlib.py", line 131, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.8/site-packages/httpx/_exceptions.py", line 359, in map_exceptions
    raise mapped_exc(message, **kwargs) from None  # type: ignore
httpx._exceptions.ReadError: Server disconnected while attempting read
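There is no communication between my pods. The request never reaches the other application, even though it runs in the same cluster and on the same node. Nginx Ingress was installed following the official documentation.
Any clues as to what might be causing this? I have ruled out everything related to the deployment itself (in this case the API or gunicorn). It seems to be related to the cluster and/or the Nginx ingress. I searched around and found material about an "idle timeout", but that does not apply to a Network Load Balancer.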
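Solution
Using http://<app-name> (where the app name is the Service name) solved the problem. In other words, call the other API through its ClusterIP Service rather than through the Ingress host. Reference: How to implement Kubernetes POD to POD Communication?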
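As a rough illustration of that fix, here is a minimal Python sketch of how the calling pod might build the in-cluster URL before passing it to httpx.post. The helper name service_url, the Service name auth-api, the /validate path and the Authorization header are all hypothetical and do not come from the question. The cluster domain cluster.local is the Kubernetes default and is assumed here.

```python
from typing import Optional


def service_url(service: str, path: str = "/", namespace: Optional[str] = None,
                port: int = 80, cluster_domain: str = "cluster.local") -> str:
    """Build an in-cluster URL for a ClusterIP Service (hypothetical helper).

    Within the same namespace the bare Service name resolves through
    cluster DNS; across namespaces the fully qualified form
    <service>.<namespace>.svc.<cluster-domain> is used instead.
    """
    if namespace is None:
        host = service
    else:
        host = f"{service}.{namespace}.svc.{cluster_domain}"
    # Port 80 is the Service port in the manifest, so it can be left implicit.
    netloc = host if port == 80 else f"{host}:{port}"
    return f"http://{netloc}/{path.lstrip('/')}"


def validate_request(token: str, auth_service: str = "auth-api") -> int:
    """Call another API through its Service instead of the Ingress host.

    "auth-api", "/validate" and the header below are assumptions for
    illustration only.
    """
    # Imported here so service_url stays usable without httpx installed.
    import httpx

    url = service_url(auth_service, "/validate")
    response = httpx.post(url, headers={"Authorization": token}, timeout=60)
    return response.status_code
```

With these assumptions, service_url("auth-api", "/validate") yields http://auth-api/validate, which stays inside the cluster and never passes through the load balancer.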