Achieving high availability and failover in an Artemis cluster using the shared-store HA policy over JGroups

Problem description

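The ActiveMQ Artemis documentation states that if high availability is configured with the replication HA policy, you can specify a group of live servers that a backup server can connect to. This is done by configuring group-name in the master and slave elements of broker.xml. A backup server will only connect to a live server that shares the same node group name.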
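For reference, a minimal sketch of what that replication setup might look like. The group name `group-a` is a placeholder chosen for illustration, and the two fragments go into the broker.xml of the live and the backup broker respectively.

```xml
<!-- Live broker: broker.xml, inside <core> -->
<ha-policy>
   <replication>
      <master>
         <!-- placeholder name; must match the backup's group-name -->
         <group-name>group-a</group-name>
      </master>
   </replication>
</ha-policy>

<!-- Backup broker: broker.xml, inside <core> -->
<ha-policy>
   <replication>
      <slave>
         <!-- this backup only pairs with a live broker in "group-a" -->
         <group-name>group-a</group-name>
      </slave>
   </replication>
</ha-policy>
```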

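With shared-store, however, there is no such concept as group-name, and I am confused. If I have to achieve high availability with shared-store over JGroups, how can it be done?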

I also tried the replication HA policy with group-name set. The cluster formed and failover worked, but I got these warnings:

2020-10-02 16:35:21,517 WARN  [org.apache.activemq.artemis.core.client] AMQ212034: There are more than one servers on the network broadcasting the same node id. You will see this message exactly once (per node) if a node is restarted,in which case it can be safely ignored. But if it is logged continuously it means you really do have more than one node on the same network active concurrently with the same node id. This Could occur if you have a backup node active at the same time as its live node. nodeID=220da24b-049c-11eb-8da6-0050569b585d
2020-10-02 16:35:25,350 WARN  [org.apache.activemq.artemis.core.server] AMQ224078: The size of duplicate cache detection (<id_cache-size/>) appears to be too large 20,000. It should be no greater than the number of messages that can be squeezed into confirmation window buffer (<confirmation-window-size/>) 32,000.

Solution

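As the name "shared-store" indicates, the live and backup brokers become a logical pair that can support high availability and failover because they share the same data store. For that same reason, no group-name configuration of any kind is needed. Such an option would be confusing, redundant, and ultimately useless.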
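A minimal sketch of a shared-store pair, assuming both brokers mount the same file system at the hypothetical path `/mnt/shared/artemis`. The path and the optional `failover-on-shutdown` and `allow-failback` settings are illustrative choices, not taken from the question.

```xml
<!-- Both brokers: broker.xml, inside <core>.
     The directories must resolve to the SAME shared location
     (hypothetical mount point shown here). -->
<paging-directory>/mnt/shared/artemis/paging</paging-directory>
<bindings-directory>/mnt/shared/artemis/bindings</bindings-directory>
<journal-directory>/mnt/shared/artemis/journal</journal-directory>
<large-messages-directory>/mnt/shared/artemis/large-messages</large-messages-directory>

<!-- Live broker only -->
<ha-policy>
   <shared-store>
      <master>
         <!-- let the backup take over on a clean shutdown as well -->
         <failover-on-shutdown>true</failover-on-shutdown>
      </master>
   </shared-store>
</ha-policy>

<!-- Backup broker only -->
<ha-policy>
   <shared-store>
      <slave>
         <!-- hand control back when the original live broker returns -->
         <allow-failback>true</allow-failback>
      </slave>
   </shared-store>
</ha-policy>
```

Note that there is no group-name anywhere: the pairing follows from the two brokers pointing at the same journal.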

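The JGroups configuration (and, more generally, the cluster-connection) exists because the two brokers need to exchange information about their respective network locations, so that the live broker can tell clients how to connect to the backup in case of a failure.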
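As a rough sketch of that discovery setup, the fragment below wires a JGroups-based broadcast group and discovery group into a cluster-connection. The file name `jgroups.xml`, the channel name, the connector name, and the host are all assumptions made for illustration; the JGroups stack file itself has to be supplied separately on the broker's classpath.

```xml
<!-- Each broker: broker.xml, inside <core> -->
<connectors>
   <!-- hypothetical address; this is what gets advertised to the peer and to clients -->
   <connector name="netty-connector">tcp://broker1.example.com:61616</connector>
</connectors>

<broadcast-groups>
   <broadcast-group name="bg-group1">
      <jgroups-file>jgroups.xml</jgroups-file>
      <jgroups-channel>artemis_broadcast_channel</jgroups-channel>
      <connector-ref>netty-connector</connector-ref>
   </broadcast-group>
</broadcast-groups>

<discovery-groups>
   <discovery-group name="dg-group1">
      <!-- same file and channel as the broadcast group -->
      <jgroups-file>jgroups.xml</jgroups-file>
      <jgroups-channel>artemis_broadcast_channel</jgroups-channel>
      <refresh-timeout>10000</refresh-timeout>
   </discovery-group>
</discovery-groups>

<cluster-connections>
   <cluster-connection name="my-cluster">
      <connector-ref>netty-connector</connector-ref>
      <discovery-group-ref discovery-group-name="dg-group1"/>
   </cluster-connection>
</cluster-connections>
```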

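Regarding the WARN message about duplicate node IDs on the network: you might get that warning once, perhaps twice, during failover or failback, but if you see it more often than that, something is wrong. If you are using shared-store, it indicates a problem with the locks on the shared file system. If you are using replication, it indicates a potential misconfiguration or a split-brain.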