Unable to access a port of a service across nodes in an overlay network in swarm mode

Problem description

I am using the following compose file for a stack deploy:

version: '3.8'
x-deploy: &Deploy
  replicas: 1
  placement: &DeployPlacement
    max_replicas_per_node: 1
  restart_policy:
    max_attempts: 15
    window: 60s
  resources: &DeployResources
    reservations: &DeployResourcesReservations
      cpus: '0.05'
      memory: 10M
services:
  serv1:
    image: alpine
    networks:
      - test_nw
    deploy:
      <<: *Deploy
    entrypoint: ["tail","-f","/dev/null"]
  serv2:
    image: nginx
    networks:
      - test_nw
    deploy:
      <<: *Deploy
      placement:
        <<: *DeployPlacement
        constraints:
          - "node.role!=manager"
    expose: # deprecated, but I leave it here anyway
      - "80"
networks:
  test_nw:
    name: test_nw
    driver: overlay

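For convenience, for the rest of this post I will refer to host1, where test_serv1 runs as container1, and host2, where test_serv2 runs as container2, because the actual host names and container names keep changing for me.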

When I get into the shell of test_serv1, this is what happens when I ping serv2:

ubuntu@host1:~$ sudo docker exec -it test_serv1.1.container1 ash
/ # ping serv2
PING serv2 (10.0.7.5): 56 data bytes
64 bytes from 10.0.7.5: seq=0 ttl=64 time=0.084 ms

However, inspecting container2 shows that its IP is 10.0.7.6:

ubuntu@host2:~$ sudo docker inspect test_serv2.1.container2
[
    {
****************
        "NetworkSettings": {
            "Bridge": "",
            "HairpinMode": false,
            "LinkLocalIPv6Address": "",
            "LinkLocalIPv6PrefixLen": 0,
            "Ports": {
                "80/tcp": null
            },
****************
            "Networks": {
                "test_nw": {
                    "IPAMConfig": {
                        "IPv4Address": "10.0.7.6"
                    },
                    "Links": null,
                    "Aliases": [
                        "80c06bb29a42"
                    ],
                    "NetworkID": "sp56aiqxnt56yglsd8mc1zqpv",
                    "EndpointID": "dac52f1d7fa148f5acac20f89d6b709193b3c11fc90201424cd052785121e706",
                    "Gateway": "",
                    "IPAddress": "10.0.7.6",
                    "IPPrefixLen": 24,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "MacAddress": "02:42:0a:00:07:06",
****************
            }
        }
    }
]
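
The two addresses are probably not a discrepancy in themselves. Assuming the service uses Swarm's default VIP endpoint mode (the compose file above does not set `endpoint_mode`), the service name resolves to a virtual IP that load-balances to the tasks, while `docker inspect` on the container reports the task's own address. A minimal sketch for checking that assumption follows; the function names are made up for illustration, and `service_vips` has to run on a manager node:

```shell
#!/usr/bin/env bash

# Print the virtual IPs assigned to a service (run on a manager node).
service_vips() {
  docker service inspect --format '{{json .Endpoint.VirtualIPs}}' "$1"
}

# Resolve the per-task addresses of a service from inside a container
# attached to the same overlay network. "tasks.<service>" is the DNS name
# Swarm provides for the individual task IPs.
task_ips() {
  local container=$1 service=$2
  docker exec "$container" nslookup "tasks.${service}"
}

# Usage, with the names used in this post:
#   service_vips test_serv2
#   task_ips test_serv1.1.container1 test_serv2
```

If the assumption holds, the first call should list 10.0.7.5 and the second should list 10.0.7.6. That would explain the differing addresses, but not why traffic from container1 never arrives.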

I can see that container2 is listening on port 80 on all interfaces. From inside it, I can ping both 10.0.7.5 and 10.0.7.6 (!!) and reach port 80 on both IPs (!!):

ubuntu@host2:~$ sudo docker exec -it test_serv2.1.container2 bash
root@80c06bb29a42:/# ping 10.0.7.5
PING 10.0.7.5 (10.0.7.5) 56(84) bytes of data.
64 bytes from 10.0.7.5: icmp_seq=1 ttl=64 time=0.093 ms
64 bytes from 10.0.7.5: icmp_seq=2 ttl=64 time=0.094 ms
^C
--- 10.0.7.5 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 8ms
rtt min/avg/max/mdev = 0.093/0.093/0.094/0.009 ms
root@80c06bb29a42:/# ping 10.0.7.6
PING 10.0.7.6 (10.0.7.6) 56(84) bytes of data.
64 bytes from 10.0.7.6: icmp_seq=1 ttl=64 time=0.035 ms
64 bytes from 10.0.7.6: icmp_seq=2 ttl=64 time=0.059 ms
64 bytes from 10.0.7.6: icmp_seq=3 ttl=64 time=0.053 ms
^C
--- 10.0.7.6 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 50ms
rtt min/avg/max/mdev = 0.035/0.049/0.059/0.010 ms
root@80c06bb29a42:/# netstat -tuplen
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       User       Inode      PID/Program name    
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      0          33110      1/nginx: master pro 
tcp        0      0 127.0.0.11:35491        0.0.0.0:*               LISTEN      0          32855      -                   
tcp6       0      0 :::80                   :::*                    LISTEN      0          33111      1/nginx: master pro 
udp        0      0 127.0.0.11:43477        0.0.0.0:*                           0          32854      -                   
root@80c06bb29a42:/# curl 10.0.7.5:80
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
root@80c06bb29a42:/# curl 10.0.7.6:80
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
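        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and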
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
root@80c06bb29a42:/# 

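However, when I try the following from container1, I just want to throw my laptop at the wall, because I cannot figure out how nobody else seems to have run into this problem and/or asked about it :/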

ubuntu@host1:~$ sudo docker exec -it test_serv1.1.container1 ash
/ # ping serv2
PING serv2 (10.0.7.5): 56 data bytes
64 bytes from 10.0.7.5: seq=0 ttl=64 time=0.084 ms
64 bytes from 10.0.7.5: seq=1 ttl=64 time=0.086 ms
^C
--- serv2 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.084/0.085/0.086 ms
/ # curl serv2:80
^C
/ # curl --max-time 10 serv2:80
curl: (28) Connection timed out after 10001 milliseconds
/ # ping test_serv2
PING test_serv2 (10.0.7.5): 56 data bytes
64 bytes from 10.0.7.5: seq=0 ttl=64 time=0.071 ms
64 bytes from 10.0.7.5: seq=1 ttl=64 time=0.064 ms
64 bytes from 10.0.7.5: seq=2 ttl=64 time=0.125 ms
^C
--- test_serv2 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.064/0.086/0.125 ms
/ # curl --max-time 10 test_serv2:80
curl: (28) Connection timed out after 10001 milliseconds
/ # ping 10.0.7.6
PING 10.0.7.6 (10.0.7.6): 56 data bytes
^C
--- 10.0.7.6 ping statistics ---
87 packets transmitted, 0 packets received, 100% packet loss
/ # curl --max-time 10 10.0.7.6:80
curl: (28) Connection timed out after 10001 milliseconds
/ # 

I have checked that all Docker ports (TCP 2376, 2377, 7946, 80 and UDP 7946, 4789) are open on both nodes.
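
A check like this can be scripted from one node against the other. The sketch below is only one way to do it, not necessarily how the check was done here: `tcp_open` and `check_tcp_ports` are hypothetical helper names, and the probe relies on bash's `/dev/tcp` pseudo-device plus the coreutils `timeout` command.

```shell
#!/usr/bin/env bash

# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# Uses bash's /dev/tcp pseudo-device, so no extra tools are required.
tcp_open() {
  local host=$1 port=$2
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Probe each given TCP port on the peer and report the result.
check_tcp_ports() {
  local peer=$1 port
  shift
  for port in "$@"; do
    if tcp_open "$peer" "$port"; then
      echo "tcp/${port} reachable"
    else
      echo "tcp/${port} NOT reachable"
    fi
  done
}

# Usage (run on host1; replace the placeholder with host2's address):
#   check_tcp_ports <host2-address> 2377 7946
```

A "NOT reachable" result can also mean that nothing is listening on that port; 2377, for instance, is only served by manager nodes. The UDP ports (7946 and 4789) cannot be verified with a connect probe like this. One option is to run `sudo tcpdump -n -i <interface> udp port 4789` on the receiving host while pinging from container1 and see whether any packets arrive at all.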

What is going wrong here? Any help is much appreciated!

Solution

I am posting this for anyone who comes looking, since there is no answer yet.

A few things to consider (even though they are mentioned in the question):

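  1. Make sure, once again, that all the ports are open. Check iptables thoroughly, even if you have already set it up once. The Docker engine seems to change the configuration, and it sometimes leaves it in an unusable state if you open the ports after Docker has started (a restart will not fix it; you need to hard stop -> reset iptables -> start Docker CE).
  2. Make sure the local IP addresses of your machines do not conflict. This is important. I cannot describe it well, but you can read up on the IP address classes and check whether there are any conflicts.
  3. Probably the most trivial instruction, yet one that is almost always left out: remember to always use --advertise-addr and --listen-addr when initializing or joining a swarm. --advertise-addr should be the public-facing IP address (even if it is not Internet-facing, it is the IP address other hosts use to reach this host). --listen-addr is not documented well enough, but it must be the IP of the interface Docker should bind to.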
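
The three points above can be sketched as shell helpers. This is only an illustration under several assumptions: the function names are made up, Docker is assumed to be managed by systemd, "reset iptables" is read as flushing the filter and nat tables, "conflict" in point 2 is read as overlapping subnets, and 2377 is assumed as the swarm management port.

```shell
#!/usr/bin/env bash

# Point 1: hard stop -> reset iptables -> start Docker.
# WARNING: flushing removes ALL filter and nat rules, not only Docker's.
# With a default DROP policy this can cut off SSH access, so review
# `sudo iptables -S` before running this on a remote machine.
reset_docker_iptables() {
  sudo systemctl stop docker.socket docker.service
  sudo iptables -F
  sudo iptables -t nat -F
  sudo systemctl start docker.service
}

# Point 2: print the subnet(s) of a Docker network next to the host's own
# IPv4 addresses, so overlapping ranges can be spotted by eye.
show_subnets() {
  local network=$1
  docker network inspect --format '{{range .IPAM.Config}}{{.Subnet}} {{end}}' "$network"
  ip -4 -brief addr show
}

# Point 3: initialize a swarm with explicit addresses.
init_swarm() {
  local advertise_ip=$1 listen_ip=$2
  docker swarm init \
    --advertise-addr "$advertise_ip" \
    --listen-addr "${listen_ip}:2377"
}

# Point 3: join an existing swarm with explicit addresses.
join_swarm() {
  local token=$1 manager_ip=$2 advertise_ip=$3 listen_ip=$4
  docker swarm join \
    --token "$token" \
    --advertise-addr "$advertise_ip" \
    --listen-addr "${listen_ip}:2377" \
    "${manager_ip}:2377"
}

# Usage (all values are placeholders):
#   show_subnets test_nw
#   init_swarm <manager-reachable-ip> <manager-interface-ip>
#   join_swarm <token> <manager-reachable-ip> <worker-reachable-ip> <worker-interface-ip>
```

The join token can be printed on a manager with `docker swarm join-token worker`. Any port rules that are still needed have to be added again after the flush, since the flush removes them along with everything else.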

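Having said all that, please note that AWS EC2 does not play well with cross-provider hosts. If your machines are spread across providers (e.g. IBM, Azure, GCP), EC2 is the spoilsport there. I am very curious how that is achieved (it must be some low-level interference with the network), but I spent a lot of time trying to get it to work and did not succeed.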
