Requests between pods fail

Problem description

I have a cluster on EKS running several APIs. This is the YAML file used to deploy them:

apiVersion: v1
kind: Service
metadata:
  name: <api-name>
spec:
  type: ClusterIP
  selector:
    app: <api-name>
  ports:
    - protocol: TCP
      port: 80
      targetPort: <container-port>
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: <api-name>
spec:
  replicas: 1
  selector:
    matchLabels:
      app: <api-name>
  template:
    metadata:
      labels:
        app: <api-name>
    spec:
      containers:
      - name: <api-name>
        image: <ecr-image-url>
        ports:
        - containerPort: <container-port>
          name: <api-name>
        env:
          - name: ENVIRONMENT
            value: <environment>
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: <api-name>
  annotations:
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
  - host: <app-name>.<dns>
    http:
      paths:
      - backend:
          serviceName: <api-name>
          servicePort: 80

Routing works fine (through a Network Load Balancer created by nginx-ingress), but when I try to make a request from one pod to another, I get:

[2020-08-14 11:49:42,214] ERROR in app: Exception on /services [GET]
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.8/site-packages/flask_cors/extension.py", line 161, in wrapped_function
    return cors_after_request(app.make_response(f(*args, **kwargs)))
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1948, in full_dispatch_request
    rv = self.preprocess_request()
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 2242, in preprocess_request
    rv = func()
  File "/app/app/views/__init__.py", line 12, in before_rest_callback
    validate_request(token)
  File "/app/app/utils.py", line 29, in validate_request
    response = httpx.post(url, headers=headers, timeout=60)
  File "/usr/local/lib/python3.8/site-packages/httpx/_api.py", line 269, in post
    return request(
  File "/usr/local/lib/python3.8/site-packages/httpx/_api.py", line 86, in request
    return client.request(
  File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 640, in request
    return self.send(
  File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 670, in send
    response = self._send_handling_redirects(
  File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 699, in _send_handling_redirects
    response = self._send_handling_auth(
  File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 736, in _send_handling_auth
    response = self._send_single_request(request, timeout)
  File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 759, in _send_single_request
    (
  File "/usr/local/lib/python3.8/contextlib.py", line 131, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.8/site-packages/httpx/_exceptions.py", line 359, in map_exceptions
    raise mapped_exc(message, **kwargs) from None  # type: ignore
httpx._exceptions.ReadError: Server disconnected while attempting read

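I have no connectivity between pods. The request never reaches the other application, even though it runs in the same cluster and on the same node. Nginx Ingress was installed following the official documentation.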

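Any clues about what the cause might be? I have ruled out everything related to the deployment itself (in this case the API or gunicorn). The problem seems to be related to the cluster and/or the Nginx ingress. I searched around and found material about "idle timeout", but that does not apply to a Network Load Balancer.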

Solution

Using http://<app-name> (where the app name is the name of the Service) solved the problem. Reference: How to implement Kubernetes POD to POD Communication?
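As a rough illustration of the fix, here is a minimal Python sketch of calling another API through its Service name instead of an external hostname. The helper `service_url`, the Service name `auth-api`, the `/validate` path, and the `Authorization` header are all hypothetical placeholders, since the original `utils.py` is not shown. The `<service>.<namespace>.svc.cluster.local` form assumes the default cluster domain; the short name alone resolves only from pods in the same namespace as the Service.

```python
from typing import Optional


def service_url(service: str, namespace: Optional[str] = None,
                port: int = 80, path: str = "/") -> str:
    """Build an in-cluster URL for a Kubernetes Service (hypothetical helper).

    Without a namespace the short Service name is used, which resolves only
    from pods in the same namespace as the Service.
    """
    if namespace is None:
        host = service
    else:
        # Assumes the default cluster domain, cluster.local.
        host = f"{service}.{namespace}.svc.cluster.local"
    # Port 80 is the HTTP default, so it is left out of the URL.
    netloc = host if port == 80 else f"{host}:{port}"
    return f"http://{netloc}/{path.lstrip('/')}"


def validate_request(token: str) -> int:
    """Call another API through its Service name and return the status code."""
    # Third-party dependency; imported here so service_url works without it.
    import httpx

    # "auth-api" and "/validate" are placeholders for the real Service and route.
    url = service_url("auth-api", path="/validate")
    headers = {"Authorization": f"Bearer {token}"}  # assumed header format
    response = httpx.post(url, headers=headers, timeout=60)
    return response.status_code
```

With the Service from the question listening on port 80, `service_url("<api-name>")` yields `http://<api-name>/`, which matches the form of the URL used in the solution. Traffic then goes from pod to Service inside the cluster and does not pass through the load balancer or the Ingress.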