Delete or change records in ETL 2

Problem description

I would like to follow up on a question from my previous post: Delete or change records in ETL

The problem described there was solved with the following statement:

 ; with todelete as (
      select *,
             count(*) over (partition by label) as cnt,
             lag(cost) over (partition by label order by time ASC) as lastcost,
             ROW_NUMBER() over (partition by label order by time ASC) as r_number
      from Table1
     )
delete from todelete 
    where cnt > 1 and r_number between 1 and (cnt/2)*2 and cost=ISNULL(lastcost,cost)

However, during testing I ran into a problem that I cannot prevent from occurring in the table: for the same "label" I have two rows that are identical except for the timestamp, with the same "cost". The solution above deletes both records. I only need to delete the older one.
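To see why both rows qualify, a small reproduction can select the window columns instead of deleting. This is only a sketch: the temp table name `#Table1`, the column types, and the `would_delete` flag are assumptions made for illustration, not part of the original setup.

```sql
-- Hypothetical repro table; the column types are assumptions.
CREATE TABLE #Table1 (label varchar(10), cost int, [time] datetime);

INSERT INTO #Table1 (label, cost, [time]) VALUES
    ('x2', 29, '2020-05-14T01:00:00'),
    ('x3', 20, '2020-05-14T01:02:00'),
    ('x2', 29, '2020-05-15T03:12:02');

-- Same window columns as the delete statement, but only selected,
-- plus a flag that mirrors the WHERE clause of the delete.
WITH todelete AS (
    SELECT *,
           COUNT(*) OVER (PARTITION BY label) AS cnt,
           LAG(cost) OVER (PARTITION BY label ORDER BY [time] ASC) AS lastcost,
           ROW_NUMBER() OVER (PARTITION BY label ORDER BY [time] ASC) AS r_number
    FROM #Table1
)
SELECT label, cost, [time], cnt, lastcost, r_number,
       CASE WHEN cnt > 1
                 AND r_number BETWEEN 1 AND (cnt / 2) * 2
                 AND cost = ISNULL(lastcost, cost)
            THEN 1 ELSE 0 END AS would_delete
FROM todelete
ORDER BY label, [time];

DROP TABLE #Table1;
```

The older "x2" row has no predecessor, so `ISNULL(lastcost, cost)` falls back to its own cost and the comparison is true. The newer "x2" row has the same cost as the previous one, so it matches as well. Both rows end up flagged.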

Thank you.

Update:

My goal:

I have a table with these records:

label   cost   time
x2       29    14/5/2020 01:00:00
x3       20    14/5/2020 01:02:00
x2       30    15/5/2020 03:12:02

I run the delete statement:

; with todelete as (
          select *,
                 count(*) over (partition by label) as cnt,
                 lag(cost) over (partition by label order by time ASC) as lastcost,
                 ROW_NUMBER() over (partition by label order by time ASC) as r_number
          from Table1
         )
    delete from todelete 
        where cnt > 1 and r_number between 1 and (cnt/2)*2 and  cost=ISNULL(lastcost,cost)

and I get the table I want:

label   cost   time
x3       20    14/5/2020 01:02:00
x2       30    15/5/2020 03:12:02

But the problem occurs when the original table looks like this:

label   cost   time
x2       29    14/5/2020 01:00:00
x3       20    14/5/2020 01:02:00
x2       29    15/5/2020 03:12:02

With the delete statement above, both records for label "x2" are deleted, but I only want to delete the older one.

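Solution

No one?

I tried it like this, but I could not solve it. Here you can see that it deletes both records (I only want the older one deleted): https://rextester.com/TLLQ93275

In this case it works correctly, but if both "x2" rows have the same price (for example 29), it deletes both entries. https://rextester.com/RHB70490

Update:

I finally managed to solve this. I added another ranking function and added a matching condition on it.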

; with todelete as (
      select *, count(*) over (partition by label) as cnt,
             lag(cost) over (partition by label order by time ASC) as lastcost,
             ROW_NUMBER() over (partition by label order by time ASC) as r_number,
             ROW_NUMBER() over (partition by label order by time DESC) as r_number2,
             RANK() over (partition by cost order by time asc) as TEST,
             ROW_NUMBER() over (partition by label order by time DESC) as TEST2
      from Table1
     )
DELETE from todelete 
    where (cnt > 1 and r_number between 1 and (cnt/2)*2 and cost=ISNULL(lastcost,cost) AND TEST2 != 1)
       OR (cnt > 1 AND TEST2 <> 1 AND r_number2 != 1)

Example: https://rextester.com/DONME54328
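Reading the final WHERE clause, `TEST2` and `r_number2` are the same expression, so the second branch already matches every row that is not the newest one for its label, and the first branch only matches a subset of those rows. If that reading is right, the statement can probably be reduced to something like the sketch below. The CTE name `ranked` and the column `rn` are made up for illustration, and it assumes the goal is to keep only the newest row per label.

```sql
-- Sketch: keep only the newest row per label, delete every older one.
-- Assumes "newest" means the highest [time] value within each label.
WITH ranked AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY label ORDER BY [time] DESC) AS rn
    FROM Table1
)
DELETE FROM ranked
WHERE rn > 1;
```

If two rows share the same label and the same time, ROW_NUMBER orders them arbitrarily, so which one survives is not guaranteed. Adding a tie-breaker column to the ORDER BY would make that deterministic.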