Split-brain clusters do not merge in a Kubernetes environment with Hazelcast 4.2.1, even though the nodes appear to agree

Problem description

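We are using Hazelcast 4.2.1 in a Kubernetes environment with the openjdk:14-jdk-slim image. In our dev environment we have only two nodes. Sometimes (roughly once in every 5 deployments, shortly after deploying) these two nodes end up in a split-brain situation and do not merge, although they find each other and agree on what to do: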

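The joiner of the first node says that the second node should join. The joiner of the second node likewise says that it should join the first node. But nothing happens. The log messages repeat every few minutes, and the clusters never merge.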

It does not matter whether we configure a merge policy or not. Most of the time, clustering works fine.

Log of the first node:

2021-07-20 09:14:08.306 DEBUG 142 --- [hz.hazelcast-instance.cached.thread-4] c.h.i.cluster.impl.MembershipManager     : [10.41.31.101]:5701 [light-cluster] [4.2.1] Sending member list to the non-master nodes:

Members {size:1,ver:5} [
        Member [10.41.31.101]:5701 - 7263bccd-f330-4b96-8b52-f22db7c7a90e this
]

2021-07-20 09:14:08.446 DEBUG 142 --- [hz.hazelcast-instance.cached.thread-5] c.h.i.cluster.impl.DiscoveryJoiner       : [10.41.31.101]:5701 [light-cluster] [4.2.1] Sending SplitBrainJoinMessage to [10.41.31.102]:5701
2021-07-20 09:14:08.448 DEBUG 142 --- [hz.hazelcast-instance.cached.thread-5] c.h.i.cluster.impl.ClusterJoinManager    : [10.41.31.101]:5701 [light-cluster] [4.2.1] Checking if we should merge to: SplitBrainJoinMessage{packetVersion=4,buildNumber=20210630,memberVersion=4.2.1,clusterVersion=4.2,address=[10.41.31.102]:5701,uuid='9cdd64b4-62c8-4f19-bf29-d3cef4e8e2f6',liteMember=false,memberCount=1,dataMemberCount=1,memberListVersion=1}
2021-07-20 09:14:08.449  INFO 142 --- [hz.hazelcast-instance.cached.thread-5] c.h.i.cluster.impl.ClusterJoinManager    : [10.41.31.101]:5701 [light-cluster] [4.2.1] [10.41.31.102]:5701 should merge to us,both have the same data member count: 1
2021-07-20 09:14:23.277 DEBUG 142 --- [hz.hazelcast-instance.cached.thread-4] c.h.i.p.InternalPartitionService         : [10.41.31.101]:5701 [light-cluster] [4.2.1] Checking partition state,stamp: -5900145379368197006

Log of the second node:

2021-07-20 09:14:24.149 DEBUG 141 --- [hz.hazelcast-instance.cached.thread-4] c.h.i.p.InternalPartitionService         : [10.41.31.102]:5701 [light-cluster] [4.2.1] Checking partition state,stamp: -8661523421455686299
2021-07-20 09:14:24.175 DEBUG 141 --- [hz.hazelcast-instance.cached.thread-4] c.h.s.d.integration.DiscoveryService     : [10.41.31.102]:5701 [light-cluster] [4.2.1] Using service name to discover nodes.
2021-07-20 09:14:24.176 DEBUG 141 --- [hz.hazelcast-instance.cached.thread-6] c.h.i.cluster.impl.MembershipManager     : [10.41.31.102]:5701 [light-cluster] [4.2.1] Sending member list to the non-master nodes:

Members {size:1,ver:1} [
        Member [10.41.31.102]:5701 - 9cdd64b4-62c8-4f19-bf29-d3cef4e8e2f6 this
]

2021-07-20 09:14:39.149 DEBUG 141 --- [hz.hazelcast-instance.cached.thread-4] c.h.i.p.InternalPartitionService         : [10.41.31.102]:5701 [light-cluster] [4.2.1] Checking partition state,stamp: -8661523421455686299
2021-07-20 09:14:54.148 DEBUG 141 --- [hz.hazelcast-instance.cached.thread-6] c.h.i.p.InternalPartitionService         : [10.41.31.102]:5701 [light-cluster] [4.2.1] Checking partition state,stamp: -8661523421455686299
2021-07-20 09:15:08.423 DEBUG 141 --- [hz.hazelcast-instance.priority-generic-operation.thread-0] c.h.i.cluster.impl.ClusterJoinManager    : [10.41.31.102]:5701 [light-cluster] [4.2.1] Checking if we should merge to: SplitBrainJoinMessage{packetVersion=4,address=[10.41.31.101]:5701,uuid='7263bccd-f330-4b96-8b52-f22db7c7a90e',memberListVersion=5}
2021-07-20 09:15:08.423  INFO 141 --- [hz.hazelcast-instance.priority-generic-operation.thread-0] c.h.i.cluster.impl.ClusterJoinManager    : [10.41.31.102]:5701 [light-cluster] [4.2.1] We should merge to [10.41.31.101]:5701,both have the same data member count: 1
2021-07-20 09:15:08.424 DEBUG 141 --- [hz.hazelcast-instance.priority-generic-operation.thread-0] c.h.i.c.i.o.SplitBrainMergeValidationOp  : [10.41.31.102]:5701 [light-cluster] [4.2.1] Returning SplitBrainJoinMessage{packetVersion=4,memberListVersion=1} to [10.41.31.101]:5701
2021-07-20 09:15:09.148 DEBUG 141 --- [hz.hazelcast-instance.cached.thread-6] c.h.i.p.InternalPartitionService         : [10.41.31.102]:5701 [light-cluster] [4.2.1] Checking partition state,stamp: -8661523421455686299
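
For anyone who wants to confirm the same symptom in their own logs, here is a minimal Python sketch that pulls the merge decision out of the two ClusterJoinManager INFO lines shown above and checks whether both members name the same node as the one that should join. The helper names (`merge_decision`, `collect_decisions`) and the regular expressions are my own illustration and not part of Hazelcast; the patterns only assume the message wording seen in these 4.2.1 logs, which may differ in other versions.

```python
import re

# Common prefix of a member log line: "[local-ip]:port [cluster-name] [version] ".
# Thread names such as "[hz.hazelcast-instance.cached.thread-5]" contain letters,
# so they cannot be mistaken for the local address.
_PREFIX = r"\[(?P<local>[\d.]+)\]:\d+ \[[^\]]*\] \[[^\]]*\] "

# "<remote> should merge to us": the remote member is expected to join the local one.
REMOTE_MERGES = re.compile(_PREFIX + r"\[(?P<remote>[\d.]+)\]:\d+ should merge to us")
# "We should merge to <remote>": the local member is expected to join the remote one.
LOCAL_MERGES = re.compile(_PREFIX + r"We should merge to \[(?P<remote>[\d.]+)\]:\d+")


def merge_decision(line):
    """Return (joining_member, target_member) for a merge-decision line, else None."""
    m = REMOTE_MERGES.search(line)
    if m:
        return (m.group("remote"), m.group("local"))
    m = LOCAL_MERGES.search(line)
    if m:
        return (m.group("local"), m.group("remote"))
    return None


def collect_decisions(lines):
    """Collect the distinct merge decisions found in an iterable of log lines."""
    return {d for d in map(merge_decision, lines) if d is not None}


# The two INFO lines from the logs above, one per node.
NODE1_LINE = (
    "2021-07-20 09:14:08.449  INFO 142 --- [hz.hazelcast-instance.cached.thread-5] "
    "c.h.i.cluster.impl.ClusterJoinManager    : [10.41.31.101]:5701 [light-cluster] "
    "[4.2.1] [10.41.31.102]:5701 should merge to us,both have the same data member count: 1"
)
NODE2_LINE = (
    "2021-07-20 09:15:08.423  INFO 141 --- "
    "[hz.hazelcast-instance.priority-generic-operation.thread-0] "
    "c.h.i.cluster.impl.ClusterJoinManager    : [10.41.31.102]:5701 [light-cluster] "
    "[4.2.1] We should merge to [10.41.31.101]:5701,both have the same data member count: 1"
)

if __name__ == "__main__":
    # A single decision in the set means both members agree on who should join whom.
    print(collect_decisions([NODE1_LINE, NODE2_LINE]))
    # prints {('10.41.31.102', '10.41.31.101')}
```

A set with exactly one entry means both sides agree that 10.41.31.102 should join 10.41.31.101, which matches what we see: the decision is consistent, yet the merge itself never starts.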
