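Problem description
I have a table with the columns Age, Period and Year. The Age column always starts at 0 and has no fixed maximum (in this example Age runs from 0 to 30, but the range could just as well be 0 to 100, etc.); the values for Period and Year only appear in certain rows, at certain ages.
However, the Age at which the values for Period and Year appear varies, so the solution should be dynamic. What is the best way to fill in the NULL values with the correct Period and Year?
I am using SQL Server.
The table looks like this:
Age Period Year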
-----------------
0 NULL NULL
1 NULL NULL
2 NULL NULL
3 NULL NULL
4 NULL NULL
5 NULL NULL
6 NULL NULL
7 NULL NULL
8 NULL NULL
9 NULL NULL
10 NULL NULL
11 NULL NULL
12 NULL NULL
13 NULL NULL
14 NULL NULL
15 NULL NULL
16 NULL NULL
17 NULL NULL
18 NULL NULL
19 NULL NULL
20 NULL NULL
21 46 2065
22 NULL NULL
23 NULL NULL
24 NULL NULL
25 NULL NULL
26 51 2070
27 NULL NULL
28 NULL NULL
29 NULL NULL
30 NULL NULL
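The result should look like this; the numbers for Period and Year should be incremented and/or decremented from the last known values of Period and Year.
Age Period Year
-----------------
0 25 2044
1 26 2045
2 27 2046
3 28 2047
4 29 2048
5 30 2049
6 31 2050
7 32 2051
8 33 2052
9 34 2053
10 35 2054
11 36 2055
12 37 2056
13 38 2057
14 39 2058
15 40 2059
16 41 2060
17 42 2061
18 43 2062
19 44 2063
20 45 2064
21 46 2065
22 47 2066
23 48 2067
24 49 2068
25 50 2069
26 51 2070
27 52 2071
28 53 2072
29 54 2073
30 55 2074
Here is an update to my question, since I did not describe my requirements in enough detail:
The solution should be able to handle different combinations of Age, Period and Year. My starting point will always be a known Age, Period and Year combination. However, the combination Age = 21, Period = 46 and Year = 2065 (or 26 | 51 | 2070 as the second combination) in my example is not static. The values at Age = 21 could be anything, e.g. Period = 2 and Year = 2021. Whatever the combination (Age, Period, Year) is, the solution should fill in the gaps and complete the sequence by counting up and down from the known values for Period and Year. If the Period value sequence becomes negative, the solution should return NULL values, if possible.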
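Solutions
It seems that Period and Year always have the same increment relative to Age, so for the sample data:
select age,
       isnull(period, age + 25) Period,   -- 46 - 21 = 25
       isnull(year, age + 2044) year      -- 2065 - 21 = 2044
from yourtable
or with the standard function COALESCE (as suggested by Gordon Linoff):
select age,
       coalesce(period, age + 25) Period,
       coalesce(year, age + 2044) year
from yourtable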
,
Table creation code
create table yourtable ( AGE int,Period int,Year int )
insert into yourtable
Select 0 AS AGE,null As Period,null As Year UNION all
Select 1 AS AGE,null As Period,null As Year UNION all
Select 2 AS AGE,null As Period,null As Year UNION all
Select 3 AS AGE,null As Period,null As Year UNION all
Select 4 AS AGE,null As Period,null As Year UNION all
Select 5 AS AGE,null As Period,null As Year UNION all
Select 6 AS AGE,null As Period,null As Year UNION all
Select 7 AS AGE,null As Period,null As Year UNION all
Select 8 AS AGE,null As Period,null As Year UNION all
Select 9 AS AGE,null As Period,null As Year UNION all
Select 10 AS AGE,null As Period,null As Year UNION all
Select 11 AS AGE,null As Period,null As Year UNION all
Select 12 AS AGE,null As Period,null As Year UNION all
Select 13 AS AGE,null As Period,null As Year UNION all
Select 14 AS AGE,null As Period,null As Year UNION all
Select 15 AS AGE,null As Period,null As Year UNION all
Select 16 AS AGE,null As Period,null As Year UNION all
Select 17 AS AGE,null As Period,null As Year UNION all
Select 18 AS AGE,null As Period,null As Year UNION all
Select 19 AS AGE,null As Period,null As Year UNION all
Select 20 AS AGE,null As Period,null As Year UNION all
Select 21 AS AGE,46 As Period,2065 As Year UNION all
Select 22 AS AGE,null As Period,null As Year UNION all
Select 23 AS AGE,null As Period,null As Year UNION all
Select 24 AS AGE,null As Period,null As Year UNION all
Select 25 AS AGE,null As Period,null As Year UNION all
Select 26 AS AGE,51 As Period,2070 As Year UNION all
Select 27 AS AGE,null As Period,null As Year UNION all
Select 28 AS AGE,null As Period,null As Year UNION all
Select 29 AS AGE,null As Period,null As Year UNION all
Select 30 AS AGE,null As Period,null As Year
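**Steps**
- We need to get a row that has non-null values for both Period and Year.
- Using that row's Age, work out the first value (the value at Age 0) for both columns.
- Now just add each row's Age to those values to fill in the whole table.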
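The three steps above come down to simple offset arithmetic. As a rough illustration outside SQL, here is a minimal Python sketch. The function name `fill_from_known_row` and the tuple layout are made up for this example, and it assumes Period and Year both move in lockstep with Age.

```python
def fill_from_known_row(rows):
    """rows: list of (age, period, year) tuples; period/year may be None."""
    # Step 1: find a row where both Period and Year are known.
    known = next((r for r in rows if r[1] is not None and r[2] is not None), None)
    if known is None:
        return list(rows)  # nothing to anchor on, leave the data as is
    k_age, k_period, k_year = known
    # Step 2: the values both columns would have at Age 0.
    period0 = k_period - k_age
    year0 = k_year - k_age
    # Step 3: add each row's Age to those base values.
    return [(age, period0 + age, year0 + age) for age, _, _ in rows]


# Hypothetical sample: ages 0..30 with a single known row at Age 21.
rows = [(a, None, None) for a in range(31)]
rows[21] = (21, 46, 2065)
filled = fill_from_known_row(rows)
print(filled[0], filled[30])  # (0, 25, 2044) (30, 55, 2074)
```

The sketch does not turn negative Period values into NULL; that would be one extra check in step 3.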
Code to fix the sequences
;with tmp as
(select top 1 * from yourtable where Period is not null and year is not null)
update yourtable
set Period = (tmp.Period - tmp.age) + yourtable.age,
    year = (tmp.year - tmp.age) + yourtable.age
from yourtable,tmp
OR
Declare @Year int,@Period int
-- offsets of Year and Period relative to Age, taken from one known row
select top 1 @Year = year - age,@Period = Period - age
from yourtable where Period is not null and year is not null
update yourtable
set Period = @Period + age,Year = @Year + age
from yourtable
,
Ultimately you want three sequences with different starting values. So you just need to calculate the offsets and add them to age:
with cte as
(
  select age,
         max(period - age) over () + age as period, -- adjusted period
         max(yr - age) over () + age as yr          -- adjusted yr
  from #yourtable
)
select age,
       -- If a Period value sequence becomes negative the solution should return NULL
       case when period >= 0 then period end as period,
       yr
from cte
See the fiddle.
,
I hope you can sort out any syntax errors. In this case we can take the known Period as the starting point for calculating the other missing values, so logic along the lines below should work. Good luck!
declare @knownperiod int, @knownyear int, @knownage int;
declare @period int, @yr int, @age int, @maxage int;
-- take the first row that has both values as the starting point
select top 1 @knownperiod = period, @knownyear = [year], @knownage = age
from table1
where period is not null and [year] is not null
order by age;
select @maxage = max(age) from table1;
-- count down from the known row to age 0
select @period = @knownperiod, @yr = @knownyear, @age = @knownage;
while (@age > 0)
begin
    select @period = @period - 1, @yr = @yr - 1, @age = @age - 1;
    update table1 set period = @period, [year] = @yr where age = @age;
end
-- now count up from the known row to the maximum age
select @period = @knownperiod, @yr = @knownyear, @age = @knownage;
while (@age < @maxage)
begin
    select @period = @period + 1, @yr = @yr + 1, @age = @age + 1;
    update table1 set period = @period, [year] = @yr where age = @age;
end
,
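Is it a matter of first calculating the increments (Age -> Period and Age -> Year) and then simply adding them to the Age values? This assumes that the difference between Age and Period, and between Age and Year, is the same in every row (the values are just not always filled in).
If so, you can use the following code to first calculate the increments (PeriodInc, YrInc) and then select the values with the increments added (note that Period is returned as NULL if it would be negative).
; WITH PeriodInc AS (
    SELECT TOP 1 Period - Age AS PeriodInc
    FROM #yourtable
    WHERE Period IS NOT NULL
),
YrInc AS (
    SELECT TOP 1 Yr - Age AS YrInc
    FROM #yourtable
    WHERE Yr IS NOT NULL
)
SELECT Age,
       CASE WHEN (Age + PeriodInc) >= 0 THEN (Age + PeriodInc) ELSE NULL END AS Period,
       Age + YrInc AS Yr
FROM #yourtable
CROSS JOIN PeriodInc
CROSS JOIN YrInc
Here is a DB Fiddle with the code.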
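,
This solution takes 4 inputs:
- @list_length - (int) number of rows to generate (up to 12^5 = 248,832)
- @start_age - (int) starting age
- @start_period - (int) starting period
- @start_year - (int) starting year
For any combination of inputs, this code generates the requested output. If a calculated 'Period' or 'Year' is negative, it is converted to NULL. The current limit on the list length can be raised as far as necessary. The technique of building row_number over cross-joined rows is very fast when generating large sequences. Above roughly 500 rows it is always faster than a recursive CTE. For small row counts there is little or no performance difference between the two techniques.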
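To make the 12^5 limit concrete, here is a small Python sketch of the same idea. It is only an analogy to the SQL tally technique: `digits` and `tally` are names invented for this example, `itertools.product` stands in for the five-way cross join, and `enumerate` plays the role of row_number().

```python
from itertools import islice, product

digits = range(1, 13)  # the 12 values of the small base list


def tally(limit):
    """Return the row numbers 1..limit, capped at 12**5 combinations."""
    # product(..., repeat=5) mirrors cross joining the base list with itself 5 times.
    combos = product(digits, repeat=5)
    # islice stops early, much like the WHERE n <= @list_length filter.
    return [i for i, _ in enumerate(islice(combos, limit), start=1)]


print(len(digits) ** 5)  # 248832
print(tally(5))          # [1, 2, 3, 4, 5]
```

Adding a sixth copy of the base list would raise the ceiling to 12^6 rows, which is what increasing the limit means in practice.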
Here is the code and the output matching the sample data.
Input
declare
@list_length int=31,@start_age int=21,@start_period int=46,@start_year int=2065;
Code
with
n(n) as (select * from (values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)) v(n)),
tally_cte(n) as (
    select row_number() over (order by (select null))
    from n n1 cross join n n2 cross join n n3 cross join n n4 cross join n n5)
select p.Age,
       case when p.[Period]<0 then null else p.[Period] end [Period],
       case when p.[Year]<0 then null else p.[Year] end [Year]
from tally_cte t
cross apply
    (select (t.n-1) [Age],
            (t.n-1)+(@start_period-@start_age) [Period],
            (t.n-1)+(@start_year-@start_age) [Year]) p
where n<=@list_length;
Output
Age Period Year
0 25 2044
1 26 2045
2 27 2046
3 28 2047
4 29 2048
5 30 2049
6 31 2050
7 32 2051
8 33 2052
9 34 2053
10 35 2054
11 36 2055
12 37 2056
13 38 2057
14 39 2058
15 40 2059
16 41 2060
17 42 2061
18 43 2062
19 44 2063
20 45 2064
21 46 2065
22 47 2066
23 48 2067
24 49 2068
25 50 2069
26 51 2070
27 52 2071
28 53 2072
29 54 2073
30 55 2074
Now suppose 'Period' and 'Year' are both smaller than the starting 'Age'. If a calculated value is negative, it is replaced with NULL.
Input
declare
@list_length int=100,@start_age int=10,@start_period int=5,@start_year int=8;
Output
Age Period Year
0 NULL NULL
1 NULL NULL
2 NULL 0
3 NULL 1
4 NULL 2
5 0 3
6 1 4
7 2 5
8 3 6
9 4 7
10 5 8
11 6 9
12 7 10
...
99 94 97
Imo, this is a flexible and efficient approach that meets all the requirements. Please let me know if there are any problems.
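,
This reads like a gaps-and-islands problem, where the "empty" rows are the gaps and the non-null rows are the islands.
You want to fill the gaps. Your question is a bit tricky, because you do not clearly describe what to do when a gap row has both a preceding and a following island, nor what to do if the two are inconsistent.
Let me assume that you want to derive the values from the next island if one is available, and otherwise fall back on the previous island.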
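As a rough illustration of that rule outside SQL, here is a minimal Python sketch. The name `fill_gaps` and the tuple layout are invented for this example, and it simplifies by assuming Period and Year are either both filled or both missing in a row.

```python
def fill_gaps(rows):
    """rows: list of (age, period, year) sorted by age; None marks a gap."""
    islands = [r for r in rows if r[1] is not None and r[2] is not None]
    out = []
    for age, period, year in rows:
        if period is None or year is None:
            # Prefer the next island, fall back on the previous one.
            nxt = next((r for r in islands if r[0] > age), None)
            prv = next((r for r in reversed(islands) if r[0] < age), None)
            ref = nxt if nxt is not None else prv
            if ref is not None:
                shift = age - ref[0]  # negative before the island, positive after
                period, year = ref[1] + shift, ref[2] + shift
        out.append((age, period, year))
    return out


# Hypothetical sample matching the question: islands at Age 21 and Age 26.
rows = [(a, None, None) for a in range(31)]
rows[21] = (21, 46, 2065)
rows[26] = (26, 51, 2070)
filled = fill_gaps(rows)
print(filled[0], filled[23], filled[30])
# (0, 25, 2044) (23, 48, 2067) (30, 55, 2074)
```

Rows after the last island have no next island, so they are the only ones that use the fallback.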
Here is an approach that uses lateral joins to retrieve the next and previous non-null rows:
select t.age,
       coalesce(t.period, n.period - n.diff, p.period - p.diff) period,
       coalesce(t.year, n.year - n.diff, p.year - p.diff) year
from mytable t
outer apply (
    select top (1) t1.*, t1.age - t.age diff
    from mytable t1
    where t1.age > t.age and t1.period is not null and t1.year is not null
    order by t1.age
) n
outer apply (
    select top (1) t1.*, t1.age - t.age diff
    from mytable t1
    where t1.age < t.age and t1.period is not null and t1.year is not null
    order by t1.age desc
) p
order by t.age
This can actually be done more efficiently with window functions. We can implement exactly the same logic by building groups of records with window counts, then doing the computation within those groups:
select
    age,
    coalesce(
        period,
        max(period) over(partition by grp2) - max(age) over(partition by grp2) + age,
        max(period) over(partition by grp1) - min(age) over(partition by grp1) + age
    ) period,
    coalesce(
        year,
        max(year) over(partition by grp2) - max(age) over(partition by grp2) + age,
        max(year) over(partition by grp1) - min(age) over(partition by grp1) + age
    ) year
from (
    select t.*,
           count(period) over(order by age) grp1,
           count(period) over(order by age desc) grp2
    from mytable t
) t
order by age
Demo on DB Fiddle - both queries yield:
age | period | year
--: | -----: | ---:
0 | 25 | 2044
1 | 26 | 2045
2 | 27 | 2046
3 | 28 | 2047
4 | 29 | 2048
5 | 30 | 2049
6 | 31 | 2050
7 | 32 | 2051
8 | 33 | 2052
9 | 34 | 2053
10 | 35 | 2054
11 | 36 | 2055
12 | 37 | 2056
13 | 38 | 2057
14 | 39 | 2058
15 | 40 | 2059
16 | 41 | 2060
17 | 42 | 2061
18 | 43 | 2062
19 | 44 | 2063
20 | 45 | 2064
21 | 46 | 2065
22 | 47 | 2066
23 | 48 | 2067
24 | 49 | 2068
25 | 50 | 2069
26 | 51 | 2070
27 | 52 | 2071
28 | 53 | 2072
29 | 54 | 2073
30 | 55 | 2074
,
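You can also use a recursive CTE (it handles any variation of the data in the table, except the case where no Period and Year are filled in at all):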
WITH cte AS ( -- get any filled period and year
SELECT TOP 1 period - age delta,[year]-period start_year
FROM tablename
WHERE period is not null and [year] is not null
),seq AS ( --get min and max age values
SELECT MIN(age) as min_age,MAX(age) as max_age
FROM tablename
),go_recursive AS (
SELECT min_age age,min_age+delta period,start_year+min_age+delta year,max_age
FROM seq
CROSS JOIN cte --That will generate the initial first row
UNION ALL
SELECT age + 1,period +1,year + 1,max_age
FROM go_recursive
WHERE age < max_age --This part increments the data from first row
)
SELECT age,period,[year]
FROM go_recursive
OPTION (MAXRECURSION 0)
-- If you know there is an upper limit on the number of rows in this kind of table,
-- use that row count instead of 0