问题描述
那很有趣:)
;WITH cte as (
SELECT Messages.Message, Timestamp,
ROW_NUMBER() OVER(PARTITION BY Message ORDER BY Timestamp) AS gn,
ROW_NUMBER() OVER (ORDER BY Timestamp) AS rn
FROM Messages
), cte2 AS (
SELECT Message, Timestamp, gn, rn, gn - rn as gb
FROM cte
), cte3 AS (
SELECT Message, MIN(Timestamp) As Ts, COUNT(1) as Cnt
FROM cte2
GROUP BY Message, gb)
SELECT Message, Cnt FROM cte3
ORDER BY Ts
这是结果集:
Message Cnt
A 2
B 1
A 3
B 1
该查询可能更短,但我以这种方式发布,因此您可以看到正在发生的事情。结果完全符合要求。这是最重要的部分,gn -
rn
该想法是对每个分区中的行进行编号,同时对整个集中的行进行编号,然后,如果从另一个中减去一个,则将获得每个组的“排名”。
;WITH cte as (
SELECT Messages.Message, Timestamp,
ROW_NUMBER() OVER(PARTITION BY Message ORDER BY Timestamp) AS gn,
ROW_NUMBER() OVER (ORDER BY Timestamp) AS rn
FROM Messages
), cte2 AS (
SELECT Message, Timestamp, gn, rn, gn - rn as gb
FROM cte
)
SELECT * FROM cte2
Message Timestamp gn rn gb
A 2015-03-29 00:00:00.000 1 1 0
A 2015-03-29 00:01:00.000 2 2 0
B 2015-03-29 00:02:00.000 1 3 -2
A 2015-03-29 00:03:00.000 3 4 -1
A 2015-03-29 00:04:00.000 4 5 -1
A 2015-03-29 00:05:00.000 5 6 -1
B 2015-03-29 00:06:00.000 2 7 -5
解决方法
我有一张表格,其中存储了发生的消息。通常有一个消息“ A”,有时A被单个消息“
B”分隔。现在,我想对这些值进行分组,以便能够对它们进行分析,例如找到最长的“ A”条纹或“ A”条纹的分布。
我已经尝试了COUNT-OVER查询,但仍在为每条消息计数。
SELECT message,COUNT(*) OVER (ORDER BY Timestamp RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
这是我的示例数据:
Timestamp Message
20150329 00:00 A
20150329 00:01 A
20150329 00:02 B
20150329 00:03 A
20150329 00:04 A
20150329 00:05 A
20150329 00:06 B
我想要以下输出
Message COUNT
A 2
B 1
A 3
B 1