Spring Boot RabbitHealthIndicator reports healthy again while the RabbitMQ connection is closed

Problem description

We use the default Spring Boot health check, which also monitors RabbitMQ.

Unfortunately, it is not reliable. For example, we have this in our log file:

2021-02-16 06:49:30.142 [AMQP Connection 10.160.98.21:5672] ERROR o.s.a.r.c.CachingConnectionFactory  - Channel shutdown: connection error; protocol method: #method<connection.close>(reply-code=320,reply-text=CONNECTION_FORCED - broker forced connection closure with reason 'shutdown',class-id=0,method-id=0)
2021-02-16 06:49:30.143 [AMQP Connection 10.160.98.21:5672] ERROR o.s.a.r.c.CachingConnectionFactory  - Channel shutdown: connection error; protocol method: #method<connection.close>(reply-code=320,method-id=0)
2021-02-16 06:49:30.148 [AMQP Connection 10.160.98.21:5672] WARN  c.r.c.impl.ForgivingExceptionHandler  - An unexpected connection driver error occured (Exception message: Connection reset)
2021-02-16 06:49:30.207 [AMQP Connection 10.160.98.21:5672] ERROR o.s.a.r.c.CachingConnectionFactory  - Channel shutdown: connection error; protocol method: #method<connection.close>(reply-code=320,method-id=0)
2021-02-16 06:49:30.208 [AMQP Connection 10.160.98.21:5672] ERROR o.s.a.r.c.CachingConnectionFactory  - Channel shutdown: connection error; protocol method: #method<connection.close>(reply-code=320,method-id=0)
2021-02-16 06:49:30.209 [AMQP Connection 10.160.98.21:5672] ERROR o.s.a.r.c.CachingConnectionFactory  - Channel shutdown: connection error; protocol method: #method<connection.close>(reply-code=320,method-id=0)
2021-02-16 06:49:30.209 [AMQP Connection 10.160.98.21:5672] WARN  c.r.c.impl.ForgivingExceptionHandler  - An unexpected connection driver error occured (Exception message: Connection reset)
2021-02-16 06:49:33.736 [http-nio-8080-exec-6] WARN  o.s.b.a.amqp.RabbitHealthIndicator  - Rabbit health check failed
org.springframework.amqp.AmqpConnectException: java.net.ConnectException: Connection refused (Connection refused)
        at org.springframework.amqp.rabbit.support.RabbitExceptionTranslator.convertRabbitAccessException(RabbitExceptionTranslator.java:61)
        at org.springframework.amqp.rabbit.connection.AbstractConnectionFactory.createBareConnection(AbstractConnectionFactory.java:524)
        at org.springframework.amqp.rabbit.connection.CachingConnectionFactory.createConnection(CachingConnectionFactory.java:751)
        at org.springframework.amqp.rabbit.connection.ConnectionFactoryUtils.createConnection(ConnectionFactoryUtils.java:214)
        at org.springframework.amqp.rabbit.core.RabbitTemplate.doExecute(RabbitTemplate.java:2089)
        at org.springframework.amqp.rabbit.core.RabbitTemplate.execute(RabbitTemplate.java:2062)
        at org.springframework.amqp.rabbit.core.RabbitTemplate.execute(RabbitTemplate.java:2042)
        at org.springframework.boot.actuate.amqp.RabbitHealthIndicator.getVersion(RabbitHealthIndicator.java:49)
        at org.springframework.boot.actuate.amqp.RabbitHealthIndicator.doHealthCheck(RabbitHealthIndicator.java:44)
        at org.springframework.boot.actuate.health.AbstractHealthIndicator.health(AbstractHealthIndicator.java:82)
....
2021-02-16 06:49:35.433 [org.springframework.amqp.rabbit.RabbitListenerEndpointContainer#0-3] WARN  o.s.a.r.l.SimpleMessageListenerContainer  - Consumer raised exception,processing can restart if the connection factory supports it. Exception summary: org.springframework.amqp.AmqpConnectException: java.net.ConnectException: Connection refused (Connection refused)
2021-02-16 06:49:35.746 [org.springframework.amqp.rabbit.RabbitListenerEndpointContainer#0-4] WARN  o.s.a.r.l.SimpleMessageListenerContainer  - Consumer raised exception,processing can restart if the connection factory supports it. Exception summary: org.springframework.amqp.AmqpConnectException: java.net.ConnectException: Connection refused (Connection refused)
2021-02-16 06:49:45.916 [http-nio-8080-exec-6] WARN  o.s.b.a.amqp.RabbitHealthIndicator  - Rabbit health check failed
org.springframework.amqp.AmqpIOException: java.io.IOException
        at org.springframework.amqp.rabbit.support.RabbitExceptionTranslator.convertRabbitAccessException(RabbitExceptionTranslator.java:70)

...
2021-02-16 06:49:55.993 [org.springframework.amqp.rabbit.RabbitListenerEndpointContainer#0-5] WARN  o.s.a.r.l.BlockingQueueConsumer  - Failed to declare queue: pubxFileUploadQueue
2021-02-16 06:49:55.994 [org.springframework.amqp.rabbit.RabbitListenerEndpointContainer#0-5] WARN  o.s.a.r.l.BlockingQueueConsumer  - Queue declaration failed; retries left=1
org.springframework.amqp.rabbit.listener.BlockingQueueConsumer$DeclarationException: Failed to declare queue(s):[pubxFileUploadQueue]
        at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.attemptPassiveDeclarations(BlockingQueueConsumer.java:700)
        at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.passiveDeclarations(BlockingQueueConsumer.java:584)
        at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.start(BlockingQueueConsumer.java:571)
        at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.initialize(SimpleMessageListenerContainer.java:1350)
        at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.run(SimpleMessageListenerContainer.java:1195)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: null
        at com.rabbitmq.client.impl.AMQChannel.wrap(AMQChannel.java:129)
        at com.rabbitmq.client.impl.AMQChannel.wrap(AMQChannel.java:125)
        at com.rabbitmq.client.impl.AMQChannel.exnWrappingRpc(AMQChannel.java:147)
        at com.rabbitmq.client.impl.ChannelN.queueDeclarePassive(ChannelN.java:1012)
        at com.rabbitmq.client.impl.ChannelN.queueDeclarePassive(ChannelN.java:46)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.springframework.amqp.rabbit.connection.CachingConnectionFactory$CachedChannelInvocationHandler.invoke(CachingConnectionFactory.java:1184)
        at com.sun.proxy.$Proxy144.queueDeclarePassive(Unknown Source)
        at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.attemptPassiveDeclarations(BlockingQueueConsumer.java:679)
        ... 5 common frames omitted
Caused by: com.rabbitmq.client.ShutdownSignalException: channel error; protocol method: #method<channel.close>(reply-code=404,reply-text=NOT_FOUND - home node 'rabbit@rabbit-rabbitmq-ha-0.rabbit-rabbitmq-ha-discovery.management.svc.cluster.local' of durable queue 'pubxFileUploadQueue' in vhost '/' is down or inaccessible,class-id=50,method-id=10)
        at com.rabbitmq.utility.ValueOrException.getValue(ValueOrException.java:66)
        at com.rabbitmq.utility.BlockingValueOrException.uninterruptibleGetValue(BlockingValueOrException.java:36)
        at com.rabbitmq.client.impl.AMQChannel$BlockingRpcContinuation.getReply(AMQChannel.java:502)
        at com.rabbitmq.client.impl.AMQChannel.privateRpc(AMQChannel.java:293)
        at com.rabbitmq.client.impl.AMQChannel.exnWrappingRpc(AMQChannel.java:141)
        ... 14 common frames omitted
Caused by: com.rabbitmq.client.ShutdownSignalException: channel error; protocol method: #method<channel.close>(reply-code=404,method-id=10)
        at com.rabbitmq.client.impl.ChannelN.asyncShutdown(ChannelN.java:517)
        at com.rabbitmq.client.impl.ChannelN.processAsync(ChannelN.java:341)
        at com.rabbitmq.client.impl.AMQChannel.handleCompleteInboundCommand(AMQChannel.java:182)
        at com.rabbitmq.client.impl.AMQChannel.handleFrame(AMQChannel.java:114)
        at com.rabbitmq.client.impl.AMQConnection.readFrame(AMQConnection.java:739)
        at com.rabbitmq.client.impl.AMQConnection.access$300(AMQConnection.java:47)
        at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:666)
        ... 1 common frames omitted

...
2021-02-16 06:50:01.012 [org.springframework.amqp.rabbit.RabbitListenerEndpointContainer#0-6] ERROR o.s.a.r.l.SimpleMessageListenerContainer  - Consumer threw missing queues exception,fatal=true
org.springframework.amqp.rabbit.listener.QueuesNotAvailableException: Cannot prepare queue for listener. Either the queue doesn't exist or the broker will not allow us to use it.
        at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.handleDeclarationException(BlockingQueueConsumer.java:651)
        at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.passiveDeclarations(BlockingQueueConsumer.java:591)
        at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.start(BlockingQueueConsumer.java:571)
        at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.initialize(SimpleMessageListenerContainer.java:1350)
        at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.run(SimpleMessageListenerContainer.java:1195)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.springframework.amqp.rabbit.listener.BlockingQueueConsumer$DeclarationException: Failed to declare queue(s):[pubxFileUploadQueue]
        at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.attemptPassiveDeclarations(BlockingQueueConsumer.java:700)
        at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.passiveDeclarations(BlockingQueueConsumer.java:584)
        ... 4 common frames omitted
Caused by: java.io.IOException: null
        at com.rabbitmq.client.impl.AMQChannel.wrap(AMQChannel.java:129)
        at com.rabbitmq.client.impl.AMQChannel.wrap(AMQChannel.java:125)
        at com.rabbitmq.client.impl.AMQChannel.exnWrappingRpc(AMQChannel.java:147)
        at com.rabbitmq.client.impl.ChannelN.queueDeclarePassive(ChannelN.java:1012)
        at com.rabbitmq.client.impl.ChannelN.queueDeclarePassive(ChannelN.java:46)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.springframework.amqp.rabbit.connection.CachingConnectionFactory$CachedChannelInvocationHandler.invoke(CachingConnectionFactory.java:1184)
        at com.sun.proxy.$Proxy144.queueDeclarePassive(Unknown Source)
        at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.attemptPassiveDeclarations(BlockingQueueConsumer.java:679)
        ... 5 common frames omitted
Caused by: com.rabbitmq.client.ShutdownSignalException: channel error; protocol method: #method<channel.close>(reply-code=404,method-id=10)
        at com.rabbitmq.client.impl.ChannelN.asyncShutdown(ChannelN.java:517)
        at com.rabbitmq.client.impl.ChannelN.processAsync(ChannelN.java:341)
        at com.rabbitmq.client.impl.AMQChannel.handleCompleteInboundCommand(AMQChannel.java:182)
        at com.rabbitmq.client.impl.AMQChannel.handleFrame(AMQChannel.java:114)
        at com.rabbitmq.client.impl.AMQConnection.readFrame(AMQConnection.java:739)
        at com.rabbitmq.client.impl.AMQConnection.access$300(AMQConnection.java:47)
        at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:666)
        ... 1 common frames omitted
2021-02-16 06:50:01.014 [org.springframework.amqp.rabbit.RabbitListenerEndpointContainer#0-6] ERROR o.s.a.r.l.SimpleMessageListenerContainer  - Stopping container from aborted consumer

But even a day later, the health check reports UP again:

{
  "status": "UP",
  "components": {
    "diskSpace": {
      "status": "UP",
      "details": {
        "total": 48393486336,
        "free": 39704743936,
        "threshold": 10485760,
        "exists": true
      }
    },
    "livenessState": {
      "status": "UP"
    },
    "ping": {
      "status": "UP"
    },
    "rabbit": {
      "status": "UP",
      "components": {
        "amqpAvailabilityTemplate": {
          "status": "UP",
          "details": {
            "version": "3.8.11"
          }
        },
        "amqpDdcTemplate": {
          "status": "UP"
        },
        "amqpOesbTemplate": {
          "status": "UP"
        },
        "amqpRankingTemplate": {
          "status": "UP"
        },
        "amqpSiglTemplate": {
          "status": "UP"
        },
        "amqpTemplate": {
          "status": "UP"
        },
        "cflAmqpTemplate": {
          "status": "UP"
        },
        "rabbitTemplate": {
          "status": "UP",
          "details": {
            "version": "3.8.11"
          }
        }
      }
    },
    "readinessState": {
      "status": "UP"
    },
    "storage": {
      "status": "UP"
    }
  },
  "groups": ["liveness", "readiness"]
}

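Am I missing some configuration, or do I need to implement the health check myself to detect this kind of problem?

In fact, I would expect the connection to recover once RabbitMQ is available again.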

Solution

Does the RabbitTemplate you are using have retry enabled? If so, could you try overriding the Rabbit health indicator and giving it a template without retry:

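import org.springframework.amqp.rabbit.connection.ConnectionFactory;
import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.boot.actuate.amqp.RabbitHealthIndicator;
import org.springframework.boot.actuate.autoconfigure.health.CompositeHealthIndicatorConfiguration;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class CustomRabbitHealthIndicatorOverride
        extends CompositeHealthIndicatorConfiguration<RabbitHealthIndicator, RabbitTemplate> {

    // A plain RabbitTemplate with no RetryTemplate set, so a failed check is not retried.
    @Bean
    public HealthIndicator rabbitHealthIndicator(ConnectionFactory connectionFactory) {
        return createHealthIndicator(new RabbitTemplate(connectionFactory));
    }

}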
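
On the "do I need to implement the health check myself" part of the question: the log ends with "Stopping container from aborted consumer", while the built-in check only asks the broker for its version through a RabbitTemplate, so it can plausibly report UP while a listener container stays stopped. The following is a minimal sketch of an additional indicator, not part of the original answer. It assumes the listeners are declared with @RabbitListener (so a RabbitListenerEndpointRegistry bean is available) and that a container stopped this way reports isRunning() == false. The class name and the "stoppedContainers" detail key are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

import org.springframework.amqp.rabbit.listener.MessageListenerContainer;
import org.springframework.amqp.rabbit.listener.RabbitListenerEndpointRegistry;
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

// Hypothetical indicator: reports DOWN when any registered listener container is not running.
@Component
public class RabbitListenerContainersHealthIndicator implements HealthIndicator {

    private final RabbitListenerEndpointRegistry registry;

    public RabbitListenerContainersHealthIndicator(RabbitListenerEndpointRegistry registry) {
        this.registry = registry;
    }

    @Override
    public Health health() {
        // Collect the ids of all containers that are currently stopped.
        List<String> stopped = new ArrayList<>();
        for (String id : registry.getListenerContainerIds()) {
            MessageListenerContainer container = registry.getListenerContainer(id);
            if (container != null && !container.isRunning()) {
                stopped.add(id);
            }
        }
        if (stopped.isEmpty()) {
            return Health.up().build();
        }
        // "stoppedContainers" is an illustrative detail key, not a Spring convention.
        return Health.down().withDetail("stoppedContainers", stopped).build();
    }
}
```

If this works as assumed, the indicator shows up as its own entry in the health output next to "rabbit", so a stopped consumer becomes visible even when the broker itself is reachable again. It only reports the state; restarting a stopped container is a separate step.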
