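Problem description
I have this starting query, but I am stuck on how to sum the differences in offline_mins, i.e. I need to add something like "and sum(offline_mins) >= 20" to the where clause.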
SELECT
    userid,
    connected,
    LAG(recordeddt) OVER(PARTITION BY userid ORDER BY userid, recordeddt) AS offline_period,
    DATEDIFF(minute, LAG(recordeddt) OVER(PARTITION BY userid ORDER BY userid, recordeddt), recordeddt) offline_mins
FROM device_data
WHERE connected = 0;
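Thanks.
Solution
This sounds like a gaps-and-islands problem, where you want to group together adjacent rows that have the same userid and status.
First, here is a query that computes the islands: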
select userid, connected, min(recordeddt) startdt, max(lead_recordeddt) enddt,
    datediff(minute, min(recordeddt), max(lead_recordeddt)) duration
from (
    select dd.*,
        row_number() over(partition by userid order by recordeddt) rn1,
        row_number() over(partition by userid, connected order by recordeddt) rn2,
        lead(recordeddt) over(partition by userid order by recordeddt) lead_recordeddt
    from device_data dd
) dd
group by userid, connected, rn1 - rn2
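To see why the difference between the two row numbers identifies an island, here is a minimal sketch on hypothetical sample rows (the `sample_data` CTE and its values are assumptions for illustration, not data from the question). It only lists the two row numbers and their difference next to each row.

```sql
-- Hypothetical sample rows for a single user; not taken from the question.
with sample_data as (
    select * from (values
        (1, cast('2020-11-02T10:00:00' as datetime), 1),
        (1, cast('2020-11-02T10:05:00' as datetime), 0),
        (1, cast('2020-11-02T10:10:00' as datetime), 0),
        (1, cast('2020-11-02T10:15:00' as datetime), 1),
        (1, cast('2020-11-02T10:20:00' as datetime), 0)
    ) v(userid, recordeddt, connected)
)
select userid, recordeddt, connected,
    -- rn1 numbers all rows of the user, rn2 numbers rows per user and status
    row_number() over(partition by userid order by recordeddt) rn1,
    row_number() over(partition by userid, connected order by recordeddt) rn2,
    -- the difference stays constant while the status does not change
    row_number() over(partition by userid order by recordeddt)
      - row_number() over(partition by userid, connected order by recordeddt) island_key
from sample_data
order by userid, recordeddt;
-- island_key per row, in order: 0, 1, 1, 2, 2
```

The last two rows share the key 2 although their status differs, so the key is only unique in combination with connected; a grouping on it should include connected as well.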
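Now, say that you want the users that were offline for at least 20 minutes every day. You can break down the islands per day, and use a having clause for filtering: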
select userid
from (
    select recordedday, userid, connected,
        datediff(minute, min(recordeddt), max(lead_recordeddt)) duration
    from (
        select dd.*, v.*,
            row_number() over(partition by v.recordedday, userid order by recordeddt) rn1,
            row_number() over(partition by v.recordedday, userid, connected order by recordeddt) rn2,
            lead(recordeddt) over(partition by v.recordedday, userid order by recordeddt) lead_recordeddt
        from device_data dd
        cross apply (values (convert(date, recordeddt))) v(recordedday)
    ) dd
    group by recordedday, userid, connected, rn1 - rn2
) dd
group by userid
having count(distinct case when connected = 0 and duration >= 20 then recordedday end) = count(distinct recordedday)
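The having clause compares the number of days with a qualifying offline island to the number of days on which the user has any rows. A small sketch of just that filter, assuming a hypothetical `islands` CTE that stands in for the per-day islands of the inner query (its values are made up for illustration):

```sql
-- Hypothetical per-day islands; durations are in minutes.
with islands as (
    select * from (values
        (1, cast('2020-11-02' as date), 0, 25),
        (1, cast('2020-11-03' as date), 0, 30),
        (2, cast('2020-11-02' as date), 0, 25),
        (2, cast('2020-11-03' as date), 0, 5),
        (2, cast('2020-11-03' as date), 1, 600)
    ) v(userid, recordedday, connected, duration)
)
select userid
from islands
group by userid
-- the case expression yields null for non-qualifying islands, which count(distinct) ignores
having count(distinct case when connected = 0 and duration >= 20 then recordedday end)
     = count(distinct recordedday);
-- returns userid 1 only: user 2 has no offline island of 20+ minutes on 2020-11-03
```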
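As mentioned above, this is a gaps-and-islands problem. Here is my take on it: use a simple lag function to create the groups, filter out the connected rows, and then work with the date ranges.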
CREATE TABLE #tmp(ID int,UserID int,dt datetime,connected int)
INSERT INTO #tmp VALUES
(1,1,'11/2/20 10:00:00',1),(2,'11/2/20 10:05:00',0),(3,'11/2/20 10:10:00',(4,'11/2/20 10:15:00',(5,'11/2/20 10:20:00',(6,2,(7,(8,(9,(10,(11,'11/2/20 10:25:00',(12,'11/2/20 10:30:00',0)
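SELECT UserID, DATEDIFF(minute, MIN(DT), MAX(DT)) OFFLINE_MINUTES
FROM
(
    -- running count of status changes: each run of identical statuses gets its own grp
    SELECT *, SUM(CASE WHEN connected <> LG THEN 1 ELSE 0 END) OVER (ORDER BY UserID, dt) grp
    FROM
    (
        -- previous status for the same user; the first row defaults to its own status
        select *, LAG(connected, 1, connected) OVER(PARTITION BY UserID ORDER BY UserID, dt) LG
        from #tmp
    ) x
) y
WHERE connected <> 1
GROUP BY UserID, grp, connected
HAVING DATEDIFF(minute, MIN(DT), MAX(DT)) >= 20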
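For a quick check without a temp table, the same lag-based grouping can be sketched as a chain of CTEs over hypothetical sample data. The `sample_data`, `flagged` and `grouped` names and all values below are assumptions for illustration, not the data from the answer.

```sql
-- Hypothetical sample data: user 1 is offline from 10:05 to 10:25, user 2 from 10:10 to 10:20.
with sample_data as (
    select * from (values
        (1, cast('2020-11-02T10:00:00' as datetime), 1),
        (1, cast('2020-11-02T10:05:00' as datetime), 0),
        (1, cast('2020-11-02T10:15:00' as datetime), 0),
        (1, cast('2020-11-02T10:25:00' as datetime), 0),
        (2, cast('2020-11-02T10:00:00' as datetime), 1),
        (2, cast('2020-11-02T10:10:00' as datetime), 0),
        (2, cast('2020-11-02T10:20:00' as datetime), 0)
    ) v(UserID, dt, connected)
),
flagged as (
    -- LG is the previous status of the same user; the first row defaults to its own status
    select *, lag(connected, 1, connected) over(partition by UserID order by dt) LG
    from sample_data
),
grouped as (
    -- the running count of status changes labels each run of identical statuses
    select *, sum(case when connected <> LG then 1 else 0 end) over(order by UserID, dt) grp
    from flagged
)
select UserID, min(dt) offline_start, max(dt) offline_end,
    datediff(minute, min(dt), max(dt)) OFFLINE_MINUTES
from grouped
where connected = 0
group by UserID, grp
having datediff(minute, min(dt), max(dt)) >= 20;
-- returns one row: UserID 1, offline for 20 minutes; user 2's 10-minute run is filtered out
```

Note that the duration here runs from the first to the last offline row of a run, so it does not include the time between the last offline row and the next connected row.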