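sql-server – How to recursively find gaps where 90 days passed between rows

This is a trivial task in my C# homeworld, but I have not yet managed it in SQL and would prefer to solve it set-based (without cursors). The result set should come from a query like this.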
SELECT SomeId,MyDate,dbo.udfLastHitRecursive(param1,param2,MyDate) as 'Qualifying'
FROM T

How it should work

I send those three params into a UDF.
The UDF internally uses the params to fetch related rows, up to 90 days older, from a view.
The UDF traverses 'MyDate' and returns 1 if the row should be included in a total calculation.
If it should not, it returns 0.
This is named "qualifying" here.

What the UDF will do

List the rows in date order.
Calculate the days between rows.
The first row in the result set defaults to Hit = 1.
If the difference is up to 90,
then pass to the next row until the sum of gaps exceeds 90 days (the 90th day must pass).
When that is reached, set Hit to 1 and reset the gap to 0.
It would also work to omit the non-qualifying rows from the result instead.

                                          | (column from the UDF, which does not work yet)
Date              Calc_date     MaxDiff   | Qualifying
2014-01-01 11:00  2014-01-01    0         | 1
2014-01-03 10:00  2014-01-01    2         | 0
2014-01-04 09:30  2014-01-03    1         | 0
2014-04-01 10:00  2014-01-04    87        | 0
2014-05-01 11:00  2014-04-01    30        | 1

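In the table above, the MaxDiff column is the gap in days from the date in the previous row. The problem with my attempts so far is that I cannot ignore the second-to-last row in the sample above.

[EDIT]
As per the comments, I have added a tag and pasted the UDF I compiled just now. It is only a placeholder, though, and will not give a useful result.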

;WITH cte (someid,otherkey,mydate,cost) AS
(
    SELECT someid,otherkey,mydate,cost
    FROM dbo.vGetVisits
    WHERE someid = @someid AND VisitCode = 3 AND otherkey = @otherkey 
    AND CONVERT(Date,mydate) = @VisitDate

    UNION ALL

    SELECT top 1 e.someid,e.otherkey,e.mydate,e.cost
    FROM dbo.vGetVisits AS E
    WHERE CONVERT(date,e.mydate) 
        BETWEEN DateAdd(dd,-90,CONVERT(Date,@VisitDate)) AND CONVERT(Date,@VisitDate)
        AND e.someid = @someid AND e.VisitCode = 3 AND e.otherkey = @otherkey 
        AND CONVERT(Date,e.mydate) = @VisitDate
        order by e.mydate
)

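I have another query, defined separately, that is closer to what I need, but I am blocked by the fact that I cannot calculate on windowed columns. I also tried a similar one that gives more or less the same output using just a LAG() over MyDate, wrapped in a DATEDIFF.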

SELECT
    t.Mydate,t.VisitCode,t.Cost,t.someId,t.otherkey,t.MaxDiff,t.DateDiff
FROM 
(
    SELECT *,MaxDiff = LAST_VALUE(Diff.Diff)  OVER (
            ORDER BY Diff.Mydate ASC
                ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
    FROM 
    (
        SELECT *,
            Diff = ISNULL(DATEDIFF(DAY,LAST_VALUE(r.Mydate) OVER (
                        ORDER BY r.Mydate ASC
                            ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING),r.Mydate),0),
            DateDiff = ISNULL(LAST_VALUE(r.Mydate) OVER (
                        ORDER BY r.Mydate ASC
                            ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING),r.Mydate)
        FROM dbo.vGetVisits AS r
        WHERE r.VisitCode = 3 AND r.someId = @SomeID AND r.otherkey = @otherkey
    ) AS Diff
) AS t
WHERE t.VisitCode = 3 AND t.someId = @SomeId AND t.otherkey = @otherkey
    AND t.Diff <= 90
ORDER BY
    t.Mydate ASC;
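
For reference, a LAG()-based variant of the kind mentioned above might look like the following minimal sketch. The table variable @Visits is a hypothetical stand-in for dbo.vGetVisits, loaded with the five sample dates from the table in the question; it is not the query that was actually tried.

```sql
-- Hypothetical stand-in for the view, holding the sample dates from the question
DECLARE @Visits AS table (MyDate datetime PRIMARY KEY);

INSERT @Visits (MyDate)
VALUES
    ('2014-01-01 11:00'),
    ('2014-01-03 10:00'),
    ('2014-01-04 09:30'),
    ('2014-04-01 10:00'),
    ('2014-05-01 11:00');

SELECT
    V.MyDate,
    -- Gap in days to the previous row; the first row has no predecessor, so 0
    MaxDiff = ISNULL(
        DATEDIFF(DAY, LAG(V.MyDate) OVER (ORDER BY V.MyDate), V.MyDate),
        0)
FROM @Visits AS V
ORDER BY
    V.MyDate;
```

This reproduces the MaxDiff values 0, 2, 1, 87 and 30 from the table. Every per-row gap is at most 90, so the gap to the previous row alone cannot mark 2014-05-01 as qualifying: the decision depends on the last qualifying date, not on the previous row.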

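Solution

As I read the question, the basic recursive algorithm required is:

1. Return the row with the earliest date in the set
2. Set that date as "current"
3. Find the row with the earliest date more than 90 days after the current date
4. Repeat from step 2 until no more rows are found

This is relatively easy to implement with a recursive common table expression.

For example, using the following sample data (based on the question):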

DECLARE @T AS table (TheDate datetime PRIMARY KEY);

INSERT @T (TheDate)
VALUES
    ('2014-01-01 11:00'),
    ('2014-01-03 10:00'),
    ('2014-01-04 09:30'),
    ('2014-04-01 10:00'),
    ('2014-05-01 11:00'),
    ('2014-07-01 09:00'),
    ('2014-07-31 08:00');

The recursive code is:

WITH CTE AS
(
    -- Anchor:
    -- Start with the earliest date in the table
    SELECT TOP (1)
        T.TheDate
    FROM @T AS T
    ORDER BY
        T.TheDate

    UNION ALL

    -- Recursive part   
    SELECT
        SQ1.TheDate
    FROM 
    (
        -- Recursively find the earliest date that is 
        -- more than 90 days after the "current" date
        -- and set the new date as "current".
        -- ROW_NUMBER + rn = 1 is a trick to get
        -- TOP in the recursive part of the CTE
        SELECT
            T.TheDate,rn = ROW_NUMBER() OVER (
                ORDER BY T.TheDate)
        FROM CTE
        JOIN @T AS T
            ON T.TheDate > DATEADD(DAY,90,CTE.TheDate)
    ) AS SQ1
    WHERE
        SQ1.rn = 1
)
SELECT 
    CTE.TheDate 
FROM CTE
OPTION (MAXRECURSION 0);

The results are:

╔═════════════════════════╗
║         TheDate         ║
╠═════════════════════════╣
║ 2014-01-01 11:00:00.000 ║
║ 2014-05-01 11:00:00.000 ║
║ 2014-07-31 08:00:00.000 ║
╚═════════════════════════╝
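
The question asks for a 0/1 "Qualifying" column on every row rather than only the qualifying rows. One way to get that, sketched here under the assumption that the dates are unique (as the primary key on the sample table guarantees), is to left join the recursive result back to the source rows. The column name Qualifying comes from the question; the rest repeats the sample data and recursion from above so the batch runs on its own.

```sql
-- Same sample data as above
DECLARE @T AS table (TheDate datetime PRIMARY KEY);

INSERT @T (TheDate)
VALUES
    ('2014-01-01 11:00'),
    ('2014-01-03 10:00'),
    ('2014-01-04 09:30'),
    ('2014-04-01 10:00'),
    ('2014-05-01 11:00'),
    ('2014-07-01 09:00'),
    ('2014-07-31 08:00');

WITH CTE AS
(
    -- Anchor: earliest date in the table
    SELECT TOP (1)
        T.TheDate
    FROM @T AS T
    ORDER BY
        T.TheDate

    UNION ALL

    -- Recursive part: earliest date more than 90 days after the "current" date
    SELECT
        SQ1.TheDate
    FROM
    (
        SELECT
            T.TheDate,
            rn = ROW_NUMBER() OVER (
                ORDER BY T.TheDate)
        FROM CTE
        JOIN @T AS T
            ON T.TheDate > DATEADD(DAY, 90, CTE.TheDate)
    ) AS SQ1
    WHERE
        SQ1.rn = 1
)
SELECT
    T.TheDate,
    -- 1 when the recursion picked this date, otherwise 0
    Qualifying = CASE WHEN CTE.TheDate IS NULL THEN 0 ELSE 1 END
FROM @T AS T
LEFT JOIN CTE
    ON CTE.TheDate = T.TheDate
ORDER BY
    T.TheDate
OPTION (MAXRECURSION 0);
```

With the sample data, the rows for 2014-01-01, 2014-05-01 and 2014-07-31 get Qualifying = 1 and the other four rows get 0, in line with the result above.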

With an index that has TheDate as its leading key, the execution plan for the recursive query is very efficient.

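You could choose to wrap this in a function and execute it directly against the view mentioned in the question, but my instincts are against it. Usually, performance is better when you select rows from the view into a temporary table, provide an appropriate index on the temporary table, and then apply the logic above. The details depend on the details of the view, but this is my general experience.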
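
A minimal sketch of that temporary-table approach follows. The view dbo.vGetVisits, its columns (someid, otherkey, mydate, VisitCode) and the filter VisitCode = 3 are taken from the question. The temporary table #Visits, the index name, the int parameter types and the sample parameter values are assumptions made for illustration only.

```sql
-- Hypothetical parameter values; the int types are an assumption
DECLARE @someid int = 1, @otherkey int = 1;

-- Materialize the relevant rows of the view once
SELECT
    V.mydate
INTO #Visits
FROM dbo.vGetVisits AS V
WHERE
    V.someid = @someid
    AND V.otherkey = @otherkey
    AND V.VisitCode = 3;

-- Index with mydate as the leading key, to support the
-- "earliest date after X" lookup in the recursive part
CREATE CLUSTERED INDEX cx_Visits_mydate ON #Visits (mydate);

WITH CTE AS
(
    -- Anchor: earliest date in the temporary table
    SELECT TOP (1)
        V.mydate
    FROM #Visits AS V
    ORDER BY
        V.mydate

    UNION ALL

    -- Recursive part: earliest date more than 90 days after the "current" date
    SELECT
        SQ1.mydate
    FROM
    (
        SELECT
            V.mydate,
            rn = ROW_NUMBER() OVER (
                ORDER BY V.mydate)
        FROM CTE
        JOIN #Visits AS V
            ON V.mydate > DATEADD(DAY, 90, CTE.mydate)
    ) AS SQ1
    WHERE
        SQ1.rn = 1
)
SELECT
    CTE.mydate
FROM CTE
OPTION (MAXRECURSION 0);

DROP TABLE #Visits;
```

Whether this beats querying the view directly depends on the view's definition and the data volume, so it is worth comparing both plans on real data.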

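For completeness (and prompted by ypercube's answer), I should mention that my other go-to solution for this type of problem (until T-SQL gets proper ordered set functions) is a SQLCLR cursor (see my answer here for an example of the technique). This performs much better than a T-SQL cursor, and is convenient for those with skills in .NET languages and the ability to run SQLCLR in their production environment. It may not offer much over the recursive solution in this scenario, because the majority of the cost is the sort, but it is worth mentioning.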
