SQL: Using LEAD / LAG and/or ROW_NUMBER over a partition to find the duration between 2 events

Problem description

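I have a table with a series of events by vehicle and time. Over time, a vehicle is either connected or disconnected.

I am stuck on calculating the duration between a connect event (connected = 1) and the subsequent disconnect event (connected = 0).

I would like to use SQL LEAD / LAG over a partition, and I am wondering how to partition the data to achieve this.

Of course, VehicleId is the first candidate. What would be the second, calculated field?

Sample data: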

CREATE TABLE #Events (VehicleId int,connected bit,Time DateTime);

INSERT INTO #Events (VehicleId,connected,Time)
VALUES
(25931,0,'2020-10-13 16:02:10.117'),(25931,1,'2020-10-13 11:32:39.213'),(25931,1,'2020-10-13 10:04:29.470'),(25925,1,'2020-10-13 07:41:31.637'),
(25925,1,'2020-10-13 06:06:22.600'),(25931,1,'2020-10-13 05:23:19.433'),(25927,1,'2020-10-13 01:01:36.460'),(25931,0,'2020-10-13 17:55:10.380'),
(25931,0,'2020-10-13 12:14:10.837'),(25931,0,'2020-10-13 10:53:54.527'),(25925,0,'2020-10-13 09:06:52.063'),(25931,0,'2020-10-13 08:32:45.230'),
(25925,0,'2020-10-13 06:42:37.627'),(25925,0,'2020-10-13 05:12:08.070'),(25927,0,'2020-10-13 04:42:23.887'),(25927,0,'2020-10-13 00:56:36.090');

SELECT * FROM #Events ORDER BY Time ASC

DROP TABLE #Events

Result of the SELECT query:

VehicleId   connected   Time
25927   0   2020-10-13 00:56:36.090
25927   1   2020-10-13 01:01:36.460
25927   0   2020-10-13 04:42:23.887
25925   0   2020-10-13 05:12:08.070
25931   1   2020-10-13 05:23:19.433
25925   1   2020-10-13 06:06:22.600
25925   0   2020-10-13 06:42:37.627
25925   1   2020-10-13 07:41:31.637
25931   0   2020-10-13 08:32:45.230
25925   0   2020-10-13 09:06:52.063
25931   1   2020-10-13 10:04:29.470
25931   0   2020-10-13 10:53:54.527
25931   1   2020-10-13 11:32:39.213
25931   0   2020-10-13 12:14:10.837
25931   0   2020-10-13 16:02:10.117
25931   0   2020-10-13 17:55:10.380

Edit: The result set I expect looks something like

VehicleId,Duration (min)
25927,221

for the following two events:

25927   1   2020-10-13 01:01:36.460
25927   0   2020-10-13 04:42:23.887

and so on, for each vehicle ID and each connected/disconnected pair.

Thanks.

Edit 2: As per the comments, FIRST_VALUE / LAST_VALUE are not suitable. The question has been updated.

Solutions

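It looks like you want to sum the time differences on the "transition" rows, i.e. the rows where a vehicle goes from connected to disconnected. If so, you can use lag():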

select vehicleid, sum(datediff(minute, lag_time, time)) sum_diff
from (
    select e.*,
           lag(connected) over(partition by vehicleid order by time) lag_connected,
           lag(time)      over(partition by vehicleid order by time) lag_time
    from #events e
) e
where connected = 0 and lag_connected = 1
group by vehicleid

For your sample data, this returns:

vehicleid | sum_diff
--------: | -------:
    25925 |      121
    25927 |      221
    25931 |      280
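
Since the question asks for one row per connect/disconnect pair rather than a per-vehicle total, the same subquery can be reused without the aggregation. This is only a sketch, assuming the #Events temp table from the question still exists; the aliases connected_at, disconnected_at and duration_min are made up for illustration.

```sql
-- Sketch: one row per connected -> disconnected transition (no GROUP BY).
-- Assumes the #Events temp table from the question is still available.
SELECT vehicleid,
       lag_time AS connected_at,      -- hypothetical alias
       [time]   AS disconnected_at,   -- hypothetical alias
       DATEDIFF(minute, lag_time, [time]) AS duration_min
FROM (
    SELECT e.*,
           LAG(connected) OVER (PARTITION BY vehicleid ORDER BY [time]) AS lag_connected,
           LAG([time])    OVER (PARTITION BY vehicleid ORDER BY [time]) AS lag_time
    FROM #Events e
) e
WHERE connected = 0 AND lag_connected = 1
ORDER BY vehicleid, connected_at;
```

Note that DATEDIFF(minute, ...) counts the minute boundaries crossed rather than full elapsed minutes, so 01:01:36 to 04:42:23 is reported as 221.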

Assuming that a 1 is always followed by a 0 for that vehicle, you can use LEAD to get the next time, and then SUM the DATEDIFF:

WITH CTE AS(
    SELECT VehicleId,connected,[Time], --Time is a data type, and doesn't have a date portion; I would suggest using a different name
           LEAD([Time]) OVER (PARTITION BY VehicleID ORDER BY [Time]) AS NextTime
    FROM #Events E
    WHERE VehicleID = 25927)
SELECT VehicleID,SUM(DATEDIFF(MINUTE,[Time],NextTime)) AS Duration
FROM CTE
WHERE Connected = 1
GROUP BY VehicleId;
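
If the assumption that a 1 is always followed by a 0 cannot be guaranteed, one possible variant also reads the next row's connected flag and keeps only the rows where it is 0. The sketch below assumes the #Events table from the question and drops the single-vehicle filter; NextConnected is a name introduced here for illustration.

```sql
-- Sketch: LEAD-based variant that checks the next event really is a disconnect.
-- Assumes the #Events temp table from the question; NextConnected is a hypothetical alias.
WITH CTE AS (
    SELECT VehicleId, connected, [Time],
           LEAD(connected) OVER (PARTITION BY VehicleId ORDER BY [Time]) AS NextConnected,
           LEAD([Time])    OVER (PARTITION BY VehicleId ORDER BY [Time]) AS NextTime
    FROM #Events
)
SELECT VehicleId, SUM(DATEDIFF(MINUTE, [Time], NextTime)) AS Duration
FROM CTE
WHERE connected = 1        -- current row is a connect event
  AND NextConnected = 0    -- and the very next event for this vehicle is a disconnect
GROUP BY VehicleId;
```

In the sample data every connect event happens to be followed directly by a disconnect, so the extra condition does not change the totals there; it only matters when two connect events occur in a row or the last event is a connect.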

This solution does not use FIRST_VALUE / LAST_VALUE.

;WITH Ranked AS
(
    SELECT 
        *,DateRowNumber = ROW_NUMBER()OVER(ORDER BY Time)
    FROM  #Events            
),Joined AS
(
    SELECT 
        *,JoiningId = CASE WHEN connected=1 THEN LEAD(DateRowNumber) OVER(PARTITION BY VehicleId ORDER BY Time) ELSE NULL END  
    FROM Ranked 
)
SELECT 
    J.VehicleId,J.Time,R.Time,DifferencetInSeconds = DATEDIFF(SECOND,J.Time,R.Time) 
FROM 
    Joined J
    INNER JOIN Ranked R ON r.DateRowNumber = J.JoiningId
ORDER BY 
    J.VehicleId,J.Time

    SELECT * FROM #Events ORDER BY VehicleID,Time


VehicleId   Time                    Time                    DifferencetInSeconds
----------- ----------------------- ----------------------- --------------------
25925       2020-10-13 06:06:22.600 2020-10-13 06:42:37.627 2175
25925       2020-10-13 07:41:31.637 2020-10-13 09:06:52.063 5121
25927       2020-10-13 01:01:36.460 2020-10-13 04:42:23.887 13247
25931       2020-10-13 05:23:19.433 2020-10-13 08:32:45.230 11366
25931       2020-10-13 10:04:29.470 2020-10-13 10:53:54.527 2965
25931       2020-10-13 11:32:39.213 2020-10-13 12:14:10.837 2491
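
On the original question of what the second, calculated partitioning field could be: one option, not taken from the answers above, is a running count of connect events per vehicle. Every row from a connect event up to the next connect event then shares the same value, which can be used as a grouping key. The sketch below assumes the #Events table from the question; SessionId, ConnectedAt, DisconnectedAt and DurationMin are names invented for illustration.

```sql
-- Sketch: derive a per-vehicle "session" key as a running count of connect events.
-- Assumes the #Events temp table from the question; all aliases are hypothetical.
WITH Grouped AS (
    SELECT VehicleId, connected, [Time],
           -- bit cannot be summed directly in SQL Server, hence the CAST
           SUM(CAST(connected AS int)) OVER (PARTITION BY VehicleId
                                             ORDER BY [Time]
                                             ROWS UNBOUNDED PRECEDING) AS SessionId
    FROM #Events
)
SELECT VehicleId, SessionId,
       MIN(CASE WHEN connected = 1 THEN [Time] END) AS ConnectedAt,
       MIN(CASE WHEN connected = 0 THEN [Time] END) AS DisconnectedAt,
       DATEDIFF(MINUTE,
                MIN(CASE WHEN connected = 1 THEN [Time] END),
                MIN(CASE WHEN connected = 0 THEN [Time] END)) AS DurationMin
FROM Grouped
WHERE SessionId > 0          -- skip disconnect rows that precede the first connect
GROUP BY VehicleId, SessionId
HAVING MIN(CASE WHEN connected = 0 THEN [Time] END) IS NOT NULL  -- skip sessions with no disconnect yet
ORDER BY VehicleId, SessionId;
```

Taking the earliest disconnect in each group means that repeated disconnect rows after a session (as for vehicle 25931 at the end of the day) do not lengthen the measured duration.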