Grouping islands of contiguous dates, including across missing weekends

Problem description

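I have a large dataset containing the dates of certain actions, and I am trying to count consecutive days. After searching around, I found https://www.sqlservercentral.com/articles/group-islands-of-contiguous-dates-sql-spackle , which is almost perfect and does nearly everything I need. Unfortunately, my dataset comes with one extra business rule that the query has to handle: if an employee's last date is a Friday and the next start date is the immediately following Monday, those dates should be grouped into the same "island", without the weekend adding to the day count. The sample dataset below shows what I mean: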

CREATE TABLE Actions
    ([Employee] varchar(2), [ActionDate] date)
;

INSERT INTO Actions
    ([Employee], [ActionDate])
VALUES
    ('AA','2019-01-03'), ('AA','2019-01-04'), ('AA','2019-01-07'), ('AA','2019-01-08'),
    ('BB','2019-08-01'), ('BB','2019-08-02'), ('BB','2019-08-03'), ('BB','2019-08-04'), ('BB','2019-08-05'), ('BB','2019-08-06'),
    ('CC','2019-09-09'), ('CC','2019-09-10'), ('CC','2019-09-11'), ('CC','2019-09-12'), ('CC','2019-09-13'), ('CC','2019-09-16'), ('CC','2019-09-17'), ('CC','2019-09-18')
;

And here is the query I found, with the columns changed to match the example:

WITH    
days As    
(    
SELECT Employee,ActionDate,DATEADD(dd,-ROW_NUMBER() OVER  (PARTITION BY Employee ORDER BY Employee,ActionDate),ActionDate) As grouping    
FROM Actions    
GROUP BY Employee,ActionDate    
)    
SELECT Employee,MIN(ActionDate) AS ActionStart,MAX(ActionDate) As ActionEnd,DATEDIFF(dd,MIN(ActionDate),MAX(ActionDate))+1 As ActLength    
FROM days    
GROUP BY Employee,grouping    
ORDER BY Employee,ActionStart

The result is:

+-------+----------+-------------+------------+-----------+
| RowNr | Employee | ActionStart | ActionEnd  | ActLength |
+-------+----------+-------------+------------+-----------+
|     1 | AA       |  03.01.2019 | 04.01.2019 |         2 |
|     2 | AA       |  07.01.2019 | 08.01.2019 |         2 |
|     3 | BB       |  01.08.2019 | 06.08.2019 |         6 |
|     4 | CC       |  09.09.2019 | 13.09.2019 |         5 |
|     5 | CC       |  16.09.2019 | 18.09.2019 |         3 |
+-------+----------+-------------+------------+-----------+
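
To see why the islands split at the weekend, it can help to look at the intermediate key the CTE computes. The sketch below runs against the sample Actions table for employee AA only; the aliases rn and islandKey are illustrative names, not part of the original query, and it assumes there are no duplicate rows per employee and date (the original query handles those with GROUP BY).

```sql
-- Subtracting the row number from the date gives the same value
-- for every row of an unbroken daily run.
SELECT Employee,
       ActionDate,
       ROW_NUMBER() OVER (PARTITION BY Employee ORDER BY ActionDate) AS rn,
       DATEADD(dd,
               -ROW_NUMBER() OVER (PARTITION BY Employee ORDER BY ActionDate),
               ActionDate) AS islandKey
FROM Actions
WHERE Employee = 'AA'
ORDER BY ActionDate;

-- ActionDate  rn  islandKey
-- 2019-01-03   1  2019-01-02
-- 2019-01-04   2  2019-01-02
-- 2019-01-07   3  2019-01-04   <- the two-day weekend gap shifts the key
-- 2019-01-08   4  2019-01-04
```

Because the key depends only on the calendar gap, a Friday-to-Monday jump always starts a new island with this technique, which is what the extra business rule has to override.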

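In this example, employee AA has an end date of Friday 04.01.2019, and the next start date, 07.01.2019, is the immediately following Monday. CC likewise has an end date of Friday 13.09.2019, followed by a start date on the following Monday, 16.09.2019. The query should "merge" these ranges without the weekend adding to ActLength. So the ideal result would be: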

+-------+----------+-------------+------------+-----------+
| RowNr | Employee | ActionStart | ActionEnd  | ActLength |
+-------+----------+-------------+------------+-----------+
|     1 | AA       |  03.01.2019 | 08.01.2019 |         4 |
|     2 | BB       |  01.08.2019 | 06.08.2019 |         6 |
|     3 | CC       |  09.09.2019 | 18.09.2019 |         8 |
+-------+----------+-------------+------------+-----------+

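Does anyone know how to build a rule like this into this kind of SQL query? I tried looking around, but people usually want to exclude weekends. Thanks in advance, everyone.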

Solution

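I find it easier to implement the required logic with lag() and a window sum. Each row gets a flag of 0 when it continues the previous row (the date is one day later, or it is a Monday exactly three days after the previous date) and 1 otherwise; the running sum of these flags is the island number. Since count(*) counts only the rows that exist, the skipped weekend does not add to the length: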

select employee,min(actionDate) actionStart,max(actionDate) actionEnd,count(*) actionLength
from (
    select 
        a.*,sum(
            case when actionDate = dateadd(day,1,lagActionDate) 
                or (actionDate = dateadd(day,3,lagActionDate) and datename(weekday,actionDate) = 'Monday')
            then 0 else 1 end
        ) over(partition by employee order by actionDate) grp
    from (
        select 
            a.*,lag(actionDate) over(partition by employee order by actionDate) lagActionDate
        from actions a
    ) a
) a
group by employee,grp

Demo on DB Fiddle

employee | actionStart | actionEnd  | actionLength
:------- | :---------- | :--------- | -----------:
AA       | 2019-01-03  | 2019-01-08 |            4
BB       | 2019-08-01  | 2019-08-06 |            6
CC       | 2019-09-09  | 2019-09-18 |            8
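
One caveat: datename(weekday, ...) returns the day name in the session language, so the comparison with 'Monday' fails under, for example, SET LANGUAGE German. The variant below is a sketch of one way to avoid that, assuming SQL Server. It counts days from 1900-01-01, which was a Monday, so a remainder of 0 modulo 7 means the date is a Monday. The anchor date and the added ORDER BY are my additions, not part of the answer above.

```sql
select employee,
       min(actionDate) actionStart,
       max(actionDate) actionEnd,
       count(*) actionLength
from (
    select a.*,
        sum(case
                -- next calendar day: same island
                when actionDate = dateadd(day, 1, lagActionDate) then 0
                -- three days later and a Monday, so the previous row was a Friday
                when actionDate = dateadd(day, 3, lagActionDate)
                     and datediff(day, '19000101', actionDate) % 7 = 0 then 0
                -- first row per employee (lag is NULL) or any other gap: new island
                else 1
            end) over (partition by employee order by actionDate) grp
    from (
        select a.*,
               lag(actionDate) over (partition by employee order by actionDate) lagActionDate
        from actions a
    ) a
) a
group by employee, grp
order by employee, actionStart;
```

On the sample data this should produce the same three rows as the query above. Like the original, it assumes at most one row per employee and date; duplicate rows would start a new island and inflate count(*).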
