Problem description
I have the following events in Splunk:
_time Agent_Hostname alarm status
2020-08-23T03:04:05.000-0700 m50-ups.a_domain upsAlarmOnBypass raised
2020-08-23T03:07:16.000-0700 m50-ups.a_domain upsTrapOnBattery raised
2020-08-23T03:07:16.000-0700 m50-ups.a_domain upsAlarmInputBad raised
2020-08-23T03:07:39.000-0700 m50-ups.a_domain upsAlarmOnBypass raised
2020-08-23T03:07:39.000-0700 m50-ups.a_domain upsAlarmLowBattery raised
2020-08-23T03:08:17.000-0700 m50-ups.a_domain upsTrapOnBattery raised
2020-08-23T03:09:24.000-0700 m50-ups.a_domain upsTrapOnBattery raised
2020-08-23T03:10:31.000-0700 m50-ups.a_domain upsAlarmOnBattery cleared
2020-08-23T03:10:32.000-0700 m50-ups.a_domain upsAlarmInputBad cleared
2020-08-23T03:11:12.000-0700 m50-ups.a_domain upsAlarmLowBattery cleared
2020-08-23T03:19:06.000-0700 m50-ups.a_domain upsAlarmInputBad raised
2020-08-23T03:19:06.000-0700 m50-ups.a_domain upsTrapOnBattery raised
2020-08-23T03:19:13.000-0700 m50-ups.a_domain upsAlarmLowBattery raised
2020-08-23T03:20:10.000-0700 m50-ups.a_domain upsTrapOnBattery raised
2020-08-23T03:21:16.000-0700 m50-ups.a_domain upsTrapOnBattery raised
2020-08-23T03:22:22.000-0700 m50-ups.a_domain upsTrapOnBattery raised
2020-08-23T03:23:29.000-0700 m50-ups.a_domain upsTrapOnBattery raised
2020-08-23T03:24:28.000-0700 m50-ups.a_domain upsAlarmInputBad cleared
2020-08-23T03:24:28.000-0700 m50-ups.a_domain upsAlarmOnBattery cleared
2020-08-23T03:25:09.000-0700 m50-ups.a_domain upsAlarmLowBattery cleared
2020-08-23T03:25:58.000-0700 m50-ups.a_domain upsAlarmOnBypass cleared
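My question is how to compute a duration record for each host and each alarm type. For example, from the events above I would like to derive the following information algorithmically, rather than by hard-coding the values from this particular example: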
start end Agent_Hostname alarm
2020-08-23T03:04:05.000-0700 2020-08-23T03:25:58.000-0700 m50-ups.a_domain upsAlarmOnBypass
2020-08-23T03:07:16.000-0700 m50-ups.a_domain upsTrapOnBattery
2020-08-23T03:07:16.000-0700 2020-08-23T03:24:28.000-0700 m50-ups.a_domain upsAlarmInputBad
2020-08-23T03:07:39.000-0700 2020-08-23T03:25:09.000-0700 m50-ups.a_domain upsAlarmLowBattery
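Here start is the earliest time the alarm was raised for the host, and end is the time the same alarm was cleared for that host.
My second question is how to find the longest duration among the closed spans, ignoring spans that have no end time.
How can I achieve this within the Splunk framework?
Solution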
println
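command can handle most cases. The only thing I could not do is show the outstanding alarms, that is, the ones that were raised but never cleared.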
println
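To make the requested logic concrete outside of Splunk, here is a minimal Python sketch of the pairing the question describes. It is not the Splunk query from this answer. It assumes, based on the expected output in the question, that start is the earliest raised time and end is the latest cleared time per host/alarm pair, and that alarms which were never raised are dropped. The names `alarm_spans`, `longest_closed_span`, and `EVENTS` are hypothetical, and `EVENTS` is only a subset of the sample events.

```python
from datetime import datetime

# Timestamp layout used in the sample events, e.g. 2020-08-23T03:04:05.000-0700
TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"

def alarm_spans(events):
    """Map (host, alarm) -> {"start": earliest raised, "end": latest cleared}.

    Assumption: an alarm with no "raised" event is dropped, and an alarm
    with no "cleared" event keeps end=None (an outstanding alarm).
    """
    spans = {}
    for ts, host, alarm, status in events:
        when = datetime.strptime(ts, TS_FORMAT)
        span = spans.setdefault((host, alarm), {"start": None, "end": None})
        if status == "raised":
            if span["start"] is None or when < span["start"]:
                span["start"] = when
        elif status == "cleared":
            if span["end"] is None or when > span["end"]:
                span["end"] = when
    return {key: s for key, s in spans.items() if s["start"] is not None}

def longest_closed_span(spans):
    """Return ((host, alarm), duration) for the longest span that has an end."""
    closed = {
        key: s["end"] - s["start"]
        for key, s in spans.items()
        if s["end"] is not None and s["end"] >= s["start"]
    }
    if not closed:
        return None
    key = max(closed, key=closed.get)
    return key, closed[key]

# A subset of the events from the question, for illustration only.
HOST = "m50-ups.a_domain"
EVENTS = [
    ("2020-08-23T03:04:05.000-0700", HOST, "upsAlarmOnBypass", "raised"),
    ("2020-08-23T03:07:16.000-0700", HOST, "upsTrapOnBattery", "raised"),
    ("2020-08-23T03:07:16.000-0700", HOST, "upsAlarmInputBad", "raised"),
    ("2020-08-23T03:10:32.000-0700", HOST, "upsAlarmInputBad", "cleared"),
    ("2020-08-23T03:24:28.000-0700", HOST, "upsAlarmInputBad", "cleared"),
    ("2020-08-23T03:24:28.000-0700", HOST, "upsAlarmOnBattery", "cleared"),
    ("2020-08-23T03:25:58.000-0700", HOST, "upsAlarmOnBypass", "cleared"),
]

if __name__ == "__main__":
    spans = alarm_spans(EVENTS)
    for (host, alarm), s in spans.items():
        print(s["start"], s["end"], host, alarm)
    print(longest_closed_span(spans))
```

In this sketch upsTrapOnBattery stays open because its clear event arrives under a different alarm name (upsAlarmOnBattery), which is also why that row has no end time in the expected output above.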