Hazelcast operation profiling

Problem

I have an application that runs Hazelcast in embedded mode. I enabled diagnostics by following https://docs.hazelcast.com/imdg/4.1/management/diagnostics.html, and it logs metrics to a separate file.

-Dhazelcast.diagnostics.enabled=true
-Dhazelcast.diagnostics.metric.level=debug
-Dhazelcast.diagnostics.invocation.sample.period.seconds=300
-Dhazelcast.diagnostics.pending.invocations.period.seconds=300
-Dhazelcast.diagnostics.slowoperations.period.seconds=300
-Dhazelcast.diagnostics.storeLatency.period.seconds=300
-Dhazelcast.diagnostics.metrics.period.seconds=300
-Dhazelcast.diagnostics.memberinfo.period.second=300
-Dhazelcast.diagnostics.directory=/u/tomcat/appn/logs
-Dhazelcast.diagnostics.max.rolled.file.size.mb=200
-Dhazelcast.diagnostics.max.rolled.file.count=5
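
Since Hazelcast runs embedded here, the same settings could in principle be applied from application code instead of the command line. The following is a minimal sketch, assuming the flags are read as ordinary JVM system properties (which is what -D sets) and that the code runs before the Hazelcast instance is created. DiagnosticsProperties and its methods are hypothetical names, not part of Hazelcast, and only a subset of the flags is shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of Hazelcast. */
final class DiagnosticsProperties {

    /** A subset of the -D flags from the question, as key/value pairs. */
    static Map<String, String> sample() {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("hazelcast.diagnostics.enabled", "true");
        p.put("hazelcast.diagnostics.metric.level", "debug");
        p.put("hazelcast.diagnostics.metrics.period.seconds", "300");
        p.put("hazelcast.diagnostics.directory", "/u/tomcat/appn/logs");
        return p;
    }

    /** Copies every entry into the JVM system properties. */
    static void apply(Map<String, String> props) {
        props.forEach(System::setProperty);
    }

    public static void main(String[] args) {
        // Assumption: this runs before the embedded Hazelcast instance is created.
        apply(sample());
        System.out.println(System.getProperty("hazelcast.diagnostics.enabled"));
    }
}
```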

The problem is that the metrics generated for the operation profile do not mention which IMap is being used. A log snippet is shown below:

11-06-2021 12:47:56 1623430076148 Metric[[name=employeeCache,unit=ms,metric=map.totalPutLatency]=102]
11-06-2021 12:47:56 1623430076148 Metric[[name=employeeCache,metric=map.totalSetLatency]=0]
11-06-2021 12:47:56 1623430076148 Metric[[name=employeeCache,metric=map.totalGetLatency]=0]
11-06-2021 12:47:56 1623430076148 Metric[[name=employeeCache,metric=map.totalMaxPutLatency]=94]
11-06-2021 12:47:56 1623430076148 Metric[[name=employeeCache,metric=map.totalMaxSetLatency]=0]
11-06-2021 12:47:56 1623430076148 Metric[[name=employeeCache,metric=map.totalMaxGetLatency]=0]
11-06-2021 12:47:56 1623430076148 Metric[[name=employeeCache,metric=map.totalMaxRemoveLatency]=0]
11-06-2021 12:47:56 1623430076148 Metric[[name=employeeCache,metric=map.totalRemoveLatency]=0]
11-06-2021 12:47:56 1623430076148 Metric[[name=employeeCache,metric=map.lastAccessTime]=0]
11-06-2021 12:47:56 1623430076148 Metric[[name=employeeCache,metric=map.lastUpdateTime]=1623429796822]
11-06-2021 12:47:56 1623430076148 Metric[[name=employeeCache,unit=count,metric=map.hits]=0]
11-06-2021 12:47:56 1623430076148 Metric[[name=employeeCache,metric=map.numberOfOtherOperations]=1]
11-06-2021 12:47:56 1623430076148 Metric[[name=employeeCache,metric=map.numberOfEvents]=30]
11-06-2021 12:47:56 1623430076148 Metric[[name=employeeCache,metric=map.getCount]=0]
11-06-2021 12:47:56 1623430076148 Metric[[name=employeeCache,metric=map.putCount]=30]
11-06-2021 12:47:56 1623430076148 Metric[[name=employeeCache,metric=map.setCount]=0]
11-06-2021 12:47:56 1623430076148 Metric[[name=employeeCache,metric=map.removeCount]=0]
11-06-2021 12:47:56 1623430076148 Metric[[name=employeeCache,metric=map.creationTime]=1623429781163]
11-06-2021 12:47:56 1623430076148 Metric[[name=employeeCache,metric=map.ownedEntryCount]=30]
11-06-2021 12:47:56 1623430076148 Metric[[name=employeeCache,metric=map.backupEntryCount]=0]
11-06-2021 12:47:56 1623430076148 Metric[[name=employeeCache,unit=bytes,metric=map.ownedEntryMemoryCost]=52775]
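
Each Metric line above does carry a name= tag, so per-map figures can be pulled out of this part of the diagnostics file. Below is a minimal sketch of such a reader, assuming the line layout shown in the snippet and integer values only. MetricLineParser is a hypothetical helper, not a Hazelcast class.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Hazelcast. */
final class MetricLineParser {

    // Captures the tag list and the integer value in: Metric[[k=v,k=v,...]=123]
    private static final Pattern METRIC =
            Pattern.compile("Metric\\[\\[(.*)\\]=(-?\\d+)\\]");

    /** Returns the tags plus a synthetic "value" entry; empty if the line is not a metric line. */
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = METRIC.matcher(line);
        if (!m.find()) {
            return fields;
        }
        for (String tag : m.group(1).split(",")) {
            int eq = tag.indexOf('=');
            if (eq > 0) {
                fields.put(tag.substring(0, eq), tag.substring(eq + 1));
            }
        }
        fields.put("value", m.group(2));
        return fields;
    }

    public static void main(String[] args) {
        String line = "11-06-2021 12:47:56 1623430076148 "
                + "Metric[[name=employeeCache,unit=ms,metric=map.totalPutLatency]=102]";
        Map<String, String> f = parse(line);
        // prints "employeeCache map.totalPutLatency = 102"
        System.out.println(f.get("name") + " " + f.get("metric") + " = " + f.get("value"));
    }
}
```

If map.totalPutLatency is read as the cumulative put latency, the sample values of 102 ms over a putCount of 30 work out to roughly 3.4 ms per put for employeeCache.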

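However, when I look at the operation profile below, it does show the operations being executed, such as Get and Put, but not which cache they ran on, that is, the name of the cache.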

Log showing the operation profile:

11-06-2021 13:54:11 1623434051149 OperationsProfiler[
                          com.hazelcast.map.impl.operation.ContainsKeyOperation[
                                  count=21
                                  totalTime(us)=1,637
                                  avg(us)=77
                                  max(us)=589
                                  latency-distribution[
                                          16..31us=9
                                          32..63us=11
                                          256..511us=1]]
                          com.hazelcast.client.impl.operations.GetConnectedClientsOperation[
                                  count=1
                                  totalTime(us)=6,896
                                  avg(us)=6,896
                                  max(us)=6,896
                                  latency-distribution[
                                          4096..8191us=1]]
                          com.hazelcast.spi.impl.eventservice.impl.operations.RegistrationOperation[
                                  count=125
                                  totalTime(us)=2,879
                                  avg(us)=23
                                  max(us)=108
                                  latency-distribution[
                                          8..15us=100
                                          16..31us=21
                                          32..63us=3
                                          64..127us=1]]
                          com.hazelcast.map.impl.operation.MapGetInvalidationMetaDataOperation[
                                  count=191
                                  totalTime(us)=140,066
                                  avg(us)=733
                                  max(us)=40,709
                                  latency-distribution[
                                          32..63us=7
                                          64..127us=31
                                          128..255us=79
                                          256..511us=61
                                          512..1023us=9
                                          1024..2047us=1
                                          2048..4095us=1
                                          8192..16383us=1
                                          16384..32767us=1]]
                          com.hazelcast.map.impl.query.QueryOperation[
                                  count=178
                                  totalTime(us)=276,769
                                  avg(us)=1,554
                                  max(us)=21,650
                                  latency-distribution[
                                          64..127us=32
                                          128..255us=10
                                          256..511us=24
                                          512..1023us=65
                                          1024..2047us=38
                                          2048..4095us=3
                                          4096..8191us=2
                                          8192..16383us=4]]
                          com.hazelcast.map.impl.operation.PutOperation[
                                  count=8,905
                                  totalTime(us)=555,519
                                  avg(us)=62
                                  max(us)=95,164
                                  latency-distribution[
                                          8..15us=377
                                          16..31us=6,869
                                          32..63us=1,176
                                          64..127us=408
                                          128..255us=51
                                          256..511us=15
                                          512..1023us=3
                                          1024..2047us=1
                                          8192..16383us=4
                                          32768..65535us=1]]]
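
The profile above is keyed by operation class only, which is why no map name appears in it. As an illustration of that structure, here is a minimal sketch that tabulates the count= value per operation class. It assumes the exact layout shown in the log (a class name ending in "[" followed by a count= line) and OperationProfileSummary is a hypothetical helper, not part of Hazelcast.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of Hazelcast. */
final class OperationProfileSummary {

    /** Maps each operation class name found in the profile to its count= value. */
    static Map<String, Long> countsByOperation(List<String> lines) {
        Map<String, Long> counts = new LinkedHashMap<>();
        String current = null;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.endsWith("[") && line.contains(".")) {
                // A fully qualified class name opens a new profile entry.
                current = line.substring(0, line.length() - 1);
            } else if (current != null && line.startsWith("count=")) {
                // Counts use thousands separators, e.g. "8,905".
                String digits = line.substring("count=".length()).replace(",", "");
                counts.put(current, Long.parseLong(digits));
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
                "com.hazelcast.map.impl.operation.ContainsKeyOperation[",
                "count=21",
                "totalTime(us)=1,637");
        System.out.println(countsByOperation(sample));
    }
}
```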

The requirement is to have the cache name shown inside the operations profiler section of the Hazelcast diagnostics log.

Solution

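The name of the cache is not really relevant in diagnostics, because the diagnostics logs are about providing information on operations. These operations are events arriving at a member (get, put, query, and so on), and diagnostics describe how those events are handled, that is, how the internal processing threads deal with them. The map/cache name is also immaterial because all maps/caches are distributed: the data is spread across all members of the cluster. If a particular operation is slow, it means it is slow for the data in that partition across all the maps/caches stored on that member.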

However, if this is absolutely necessary for your case and you are keen to build it, feel free to create a PR and submit it to https://github.com/hazelcast/hazelcast