Upgrading to nginx 3.11.1

Problem description

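I have a Kubernetes cluster running in AWS, and I am working through upgrading various components. Internally we use NGINX, currently at v1.1.1 of the nginx-ingress chart (as served from old stable), with the following configuration: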

controller:
  publishService:
    enabled: "true"
  replicaCount: 3
  service:
    annotations:
      external-dns.alpha.kubernetes.io/hostname: '*.MY.TOP.LEVEL.DOMAIN'
      service.beta.kubernetes.io/aws-load-balancer-backend-protocol: http
      service.beta.kubernetes.io/aws-load-balancer-internal: 0.0.0.0/0
      service.beta.kubernetes.io/aws-load-balancer-ssl-cert: [SNIP]
      service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "443"
    targetPorts:
      http: http
      https: http

My service's Ingress resource looks like this:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
  [SNIP]
spec:
  rules:
  - host: MY-SERVICE.MY.TOP.LEVEL.DOMAIN
    http:
      paths:
      - backend:
          serviceName: MY-SERVICE
          servicePort: 80
        path: /
status:
  loadBalancer:
    ingress:
    - hostname: [SNIP]

This configuration works just fine. However, it stops working when I upgrade to v3.11.1 of the ingress-nginx chart (as served from the k8s museum).

With the configuration unmodified, curling the HTTPS URL redirects back to itself:

curl -v https://MY-SERVICE.MY.TOP.LEVEL.DOMAIN/INTERNAL/ROUTE
*   Trying W.X.Y.Z...
* TCP_NODELAY set
* Connected to MY-SERVICE.MY.TOP.LEVEL.DOMAIN (W.X.Y.Z) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), Server hello (2):
* TLSv1.2 (IN), Certificate (11):
* TLSv1.2 (IN), Server key exchange (12):
* TLSv1.2 (IN), Server finished (14):
* TLSv1.2 (OUT), Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), Finished (20):
* TLSv1.2 (IN), Change cipher spec (1):
* TLSv1.2 (IN), Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: CN=*.MY.TOP.LEVEL.DOMAIN
*  start date: Aug 21 00:00:00 2020 GMT
*  expire date: Sep 20 12:00:00 2021 GMT
*  subjectAltName: host "MY-SERVICE.MY.TOP.LEVEL.DOMAIN" matched cert's "*.MY.TOP.LEVEL.DOMAIN"
*  issuer: C=US; O=Amazon; OU=Server CA 1B; CN=Amazon
*  SSL certificate verify ok.
> GET /INTERNAL/ROUTE HTTP/1.1
> Host: MY-SERVICE.MY.TOP.LEVEL.DOMAIN
> User-Agent: curl/7.64.1
> Accept: */*
> 
< HTTP/1.1 308 Permanent Redirect
< Content-Type: text/html
< Date: Wed, 28 Apr 2021 19:07:57 GMT
< Location: https://MY-SERVICE.MY.TOP.LEVEL.DOMAIN/INTERNAL/ROUTE
< Content-Length: 164
< Connection: keep-alive
< 
<html>
<head><title>308 Permanent Redirect</title></head>
<body>
<center><h1>308 Permanent Redirect</h1></center>
<hr><center>nginx</center>
</body>
</html>
* Connection #0 to host MY-SERVICE.MY.TOP.LEVEL.DOMAIN left intact
* Closing connection 0

(I wish I had captured more verbose output...)

I tried modifying the NGINX config to add the following:

config:
  use-forwarded-headers: "true"

and then:

config:
  compute-full-forwarded-for: "true"
  use-forwarded-headers: "true"

Neither of these seemed to make a difference. It was the middle of the day, so I could not dig very deep before rolling back.

What should I be looking at, and how should I debug this?

Update:

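I wish I had posted a complete copy of the updated config, because I would then have noticed that I had not correctly applied the change that adds config.compute-full-forwarded-for: "true". It needs to be inside the controller block, and I had placed it elsewhere.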

Once the compute-full-forwarded-for: "true" setting was added in the right place, everything started working immediately.
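For illustration, here is a minimal sketch of how the values file might look with the config map nested under controller. The question does not say where the setting was originally misplaced, so the comment about a top-level key is only an assumed example. The service block is elided and would stay as in the original values.

```yaml
# Sketch of Helm values for the ingress-nginx chart.
controller:
  # Keys under controller.config are the ones the chart passes on to the
  # controller. A `config:` block placed elsewhere (for example at the top
  # level of the file, an assumed case) would not reach the controller.
  config:
    compute-full-forwarded-for: "true"
    use-forwarded-headers: "true"
  publishService:
    enabled: "true"
  replicaCount: 3
  # service: annotations and targetPorts unchanged from the original values
```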

Solution

This is a community wiki answer posted for better visibility. Feel free to expand it.

As confirmed by @object88, the issue was a misplaced config.compute-full-forwarded-for: "true" setting, which had been put in the wrong block. Moving it into the controller block solved the problem.
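One way to check that such a setting actually reached the controller is to inspect the ConfigMap that the chart renders from controller.config. The sketch below shows roughly what that object should contain once the values are placed correctly. The name and namespace are hypothetical placeholders, since they depend on the Helm release name and the namespace chosen at install time.

```yaml
# Expected shape of the rendered controller ConfigMap (sketch).
apiVersion: v1
kind: ConfigMap
metadata:
  # Hypothetical name and namespace; the real ones depend on the release.
  name: RELEASE-NAME-ingress-nginx-controller
  namespace: ingress-nginx
data:
  compute-full-forwarded-for: "true"
  use-forwarded-headers: "true"
```

If these keys are missing from the ConfigMap in the cluster, the values were most likely not nested under controller.config.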