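Problem description
Please help me solve the following task in SQL (MS SQL Server 2017). It is simple in Excel, but very complicated in SQL.
There is a table of clients and their activity, broken down by day: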
client 1may 2may 3may 4may 5may other days
client1 0 0 0 0 0 ...
client2 0 0 0 0 0 ...
client3 0 0 0 0 0 ...
client4 1 1 1 1 1 ...
client5 1 1 1 0 0 ...
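I need to create the same table (same number of rows and columns), but with the values converted to new ones according to the following rules. The value for the current day is:
A) 1 if all daily values in the week up to that day (including the current value) are 1
B) 0 if all daily values in the week up to that day (including the current value) are 0
C) if the values differ, keep the previous day's status (if the previous day's status is unknown, for example because the client is new, then 0)
In Excel I do this with the following formula: =IF(AND(AF2=AE2; AE2=AD2; AD2=AC2; AC2=AB2; AB2=AA2; AA2=Z2); current_day_value; IF(previous_day_value=""; 0; previous_day_value)).
An example Excel file is attached.
Thank you very much.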
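Solution
First of all, having dates as columns is never a good idea.
So, step 1 is to turn the columns into rows. In other words, build a table with three columns: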
```
client date Value
client1 May1 0
client1 May2 0
client1 May3 0
.... ... ..
client4 May1 1
client4 May2 1
client4 May3 1
.... ... ..
```
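One way to sketch step 1 in SQL Server is `CROSS APPLY` with a `VALUES` list, which emits one row per day column. The table name `#wide`, the bracketed column names, and the year 2020 are assumptions made for illustration; the question does not give the real table layout.

```sql
-- Hypothetical wide table: one column per day (names and year are assumptions)
CREATE TABLE #wide (client varchar(20), [1may] int, [2may] int, [3may] int);
INSERT INTO #wide (client, [1may], [2may], [3may])
VALUES ('client4', 1, 1, 1),
       ('client5', 1, 1, 0);

-- Unpivot: each day column becomes its own (client, date, Value) row
SELECT w.client, v.[date], v.[Value]
FROM #wide AS w
CROSS APPLY (VALUES
    (CONVERT(date, '2020-05-01'), w.[1may]),
    (CONVERT(date, '2020-05-02'), w.[2may]),
    (CONVERT(date, '2020-05-03'), w.[3may])
) AS v ([date], [Value])
ORDER BY w.client, v.[date];

DROP TABLE #wide;
```

With the two sample clients this returns six rows, three per client. A real table would need one `VALUES` entry per day column.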
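Step 2: perform all the required calculations using the date field.
In any case, you basically always keep the previous day's state (except for nulls).
So, assuming the first column is 1may, I would do something like this (Oracle syntax, which should also work in SQL Server; column names that start with a digit have to be quoted):
INSERT INTO newTable (client, "1may", "2may", ....) SELECT client, COALESCE("1may", 0), COALESCE("2may", .... FROM oldTable;
Either way, I also think that having dates as columns of a relational table is not good practice.
You are going to struggle with this, because most SQL dialects do not allow "arbitrary pivoting": you have to name the columns you want the pivot to produce, whereas Excel simply does that for you. SQL can do it, but it requires dynamic SQL, which gets complicated and annoying very quickly.
I would suggest using SQL only to build the data, and then using Excel or SSRS (since you are on T-SQL) for the visualization.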
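To give a sense of what the dynamic SQL route involves, here is a minimal sketch that builds the pivot column list at run time. The temp table `#long_results`, its columns, and the sample rows are hypothetical; `STRING_AGG` needs SQL Server 2017 or later.

```sql
-- Hypothetical long-format results (table and column names are assumptions)
CREATE TABLE #long_results (client varchar(20), dates date, final_result int);
INSERT INTO #long_results (client, dates, final_result) VALUES
    ('client4', '2020-05-01', 1), ('client4', '2020-05-02', 1),
    ('client5', '2020-05-01', 1), ('client5', '2020-05-02', 0);

DECLARE @cols nvarchar(max), @sql nvarchar(max);

-- Build the quoted column list, e.g. [2020-05-01],[2020-05-02]
SELECT @cols = STRING_AGG(CONVERT(nvarchar(max), QUOTENAME(CONVERT(char(10), d.dates, 23))), N',')
               WITHIN GROUP (ORDER BY d.dates)
FROM (SELECT DISTINCT dates FROM #long_results) AS d;

-- The column list has to be spliced into the statement text, hence dynamic SQL
SET @sql = N'
SELECT client, ' + @cols + N'
FROM (SELECT client, CONVERT(char(10), dates, 23) AS day_label, final_result
      FROM #long_results) AS src
PIVOT (MAX(final_result) FOR day_label IN (' + @cols + N')) AS p
ORDER BY client;';

EXEC sys.sp_executesql @sql;

DROP TABLE #long_results;
```

`QUOTENAME` keeps the generated column names safe to splice into the statement. Even for this small case the query has to be assembled as a string, which is the complexity mentioned above.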
Anyway, I think this does what you want:
WITH Data AS (
    -- Sample rows: the Client 1 dates are from the original example; the 0/1 values after
    -- the first row and the Client 2 date are illustrative placeholders.
    SELECT * FROM (VALUES
        ('Client 1', CONVERT(DATE, '2020-05-04'), 1), ('Client 1', '2020-05-05', 1),
        ('Client 1', '2020-05-06', 1), ('Client 1', '2020-05-07', 1),
        ('Client 1', '2020-05-08', 1), ('Client 1', '2020-05-09', 1),
        ('Client 1', '2020-05-10', 1), ('Client 1', '2020-05-11', 0),
        ('Client 1', '2020-05-12', 1), ('Client 2', '2020-05-04', 1)
    ) x (Client, RowDate, Value)
)
SELECT
    Client, RowDate, Value,
    CASE
        WHEN OnesBefore = DaysInWeek THEN 1
        WHEN ZerosBefore = DaysInWeek THEN 0
        ELSE PreviousDayValue
    END AS FinalCalculation
FROM (
    -- This set uses windowing to calculate the intermediate values
    SELECT
        *,
        -- The count of the days present in the data; part of the week may be missing, so we can't assume 7.
        -- We only count up to this day, so it's in line with the other parts of the calculation.
        COUNT(RowDate) OVER (PARTITION BY Client, WeekCommencing ORDER BY RowDate) AS DaysInWeek,
        -- Count up the 1's for this client and week, in date order, up to (and including) this date
        COUNT(IIF(Value = 1, 1, NULL)) OVER (PARTITION BY Client, WeekCommencing ORDER BY RowDate) AS OnesBefore,
        -- Count up the 0's for this client and week, in date order, up to (and including) this date
        COUNT(IIF(Value = 0, 1, NULL)) OVER (PARTITION BY Client, WeekCommencing ORDER BY RowDate) AS ZerosBefore,
        -- Get the previous day's value, or 0 if there isn't one
        COALESCE(LAG(Value) OVER (PARTITION BY Client, WeekCommencing ORDER BY RowDate), 0) AS PreviousDayValue
    FROM (
        -- This set adds a few simple values that we can leverage later
        SELECT
            *, DATEADD(DAY, -DATEPART(DW, RowDate) + 1, RowDate) AS WeekCommencing
        FROM Data
    ) AS DataWithExtras
) AS DataWithCalculations
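Since you haven't specified the table layout, I don't know which table and field names to use in the example. Hopefully this is right and you can work out how to fit it to what you have; if not, please leave a comment.
I'll also note that I deliberately made this verbose. If you don't know what the OVER clause is, you need to read this: https://www.sqlshack.com/use-window-functions-sql-server/. The gist is that window functions aggregate without actually squashing the rows together.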
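As a small illustration of that last point, the sketch below uses the same `Data` shape as the query above with three made-up rows. Every input row is returned, each carrying aggregates computed over its client's rows.

```sql
WITH Data AS (
    -- Made-up sample rows, for illustration only
    SELECT * FROM (VALUES
        ('Client 1', CONVERT(DATE, '2020-05-04'), 1),
        ('Client 1', CONVERT(DATE, '2020-05-05'), 0),
        ('Client 2', CONVERT(DATE, '2020-05-04'), 1)
    ) x (Client, RowDate, Value)
)
SELECT
    Client, RowDate, Value,
    -- Total rows for the client, repeated on each of that client's rows
    COUNT(*) OVER (PARTITION BY Client) AS RowsForClient,
    -- Running count of 1's per client, in date order, up to and including this row
    SUM(Value) OVER (PARTITION BY Client ORDER BY RowDate) AS RunningOnes
FROM Data
ORDER BY Client, RowDate;
```

All three rows come back. The equivalent `GROUP BY Client` query would return only two rows, one per client.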
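Edit: adjusted the calculation so that it can handle an arbitrary number of days in the week.
Thank you very much everyone, especially David and Massimo, who prompted me to reorganize the data.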
-- Join every client with every date, then label each pair as active (1) or inactive (0)
with a as (
    select client, dates
    from (select distinct client from dbo.clients) a
    cross join (select dates from dbo.dates) b
), b as (
    select b.dates, 1 as active, a.client
    from dbo.clients a
    join dbo.dates b on a.id = b.id
)
select a.client, a.dates, isnull(b.active, 0) as active
into #tmp2
from a
left join b on a.client = b.client and a.dates = b.dates;
-- Declare variables: the start date and the loop counter
declare @min_date date = (select min(dates) from #tmp2);
declare @n int = 1;
declare @row int = (select count(distinct dates) from #tmp2); -- number of loop iterations
-- Delete data from the final results
delete from dbo.final_results;
-- Fill the table with the final results.
-- Each loop iteration analyses one 7-day range, ending on day number @n.
while @n <= @row
begin
    with a as (
        -- Count active and inactive days per client within the current 7-day range
        select client,
               max(dates) as dates,
               sum(case when active = 1 then 1 else null end) as sum_active,
               sum(case when active = 0 then 1 else null end) as sum_inactive
        from #tmp2
        where dates between dateadd(day, -7 + @n, @min_date) and dateadd(day, -1 + @n, @min_date)
        group by client
    )
    insert into dbo.final_results (client, dates, final_result)
    select client, dates,
           case when sum_active = 7 then 1      -- rule A
                when sum_inactive = 7 then 0    -- rule B
                when isnull(sum_active, 0) + isnull(sum_inactive, 0) < 7 then 0  -- fewer than 7 days of history
                else (select final_result       -- rule C: keep the previous day's result
                      from dbo.final_results b
                      where b.dates = dateadd(day, -1, a.dates)
                        and a.client = b.client)
           end
    from a;
    set @n = @n + 1;
end
if object_id(N'tempdb..#tmp2', 'U') is not null drop table #tmp2;
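For comparison, the same rules can probably be expressed without a loop. This is an alternative sketch, not the method used above. It assumes one row per client per day with no gaps, which the cross join above provides. The CTE names and sample rows are hypothetical. The idea is that a day "decides" the status only when its 7-day window is complete and uniform, and every other day inherits the most recent decision (0 if there is none yet).

```sql
WITH activity AS (
    -- Hypothetical sample: seven active days, one inactive day, one active day
    SELECT * FROM (VALUES
        ('client5', CONVERT(date, '2020-05-01'), 1), ('client5', '2020-05-02', 1),
        ('client5', '2020-05-03', 1), ('client5', '2020-05-04', 1),
        ('client5', '2020-05-05', 1), ('client5', '2020-05-06', 1),
        ('client5', '2020-05-07', 1), ('client5', '2020-05-08', 0),
        ('client5', '2020-05-09', 1)
    ) v (client, dates, active)
), windowed AS (
    -- Rolling totals over the current row and the six rows before it
    SELECT client, dates,
           SUM(active) OVER (PARTITION BY client ORDER BY dates
                             ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS ones_in_window,
           COUNT(*)    OVER (PARTITION BY client ORDER BY dates
                             ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS days_in_window
    FROM activity
), decided AS (
    -- decision is NULL when the window is mixed or has fewer than 7 days
    SELECT client, dates,
           CASE WHEN days_in_window = 7 AND ones_in_window = 7 THEN 1   -- rule A
                WHEN days_in_window = 7 AND ones_in_window = 0 THEN 0   -- rule B
           END AS decision
    FROM windowed
), grouped AS (
    -- Running count of decisions: rows share a group with the latest deciding day
    SELECT client, dates, decision,
           COUNT(decision) OVER (PARTITION BY client ORDER BY dates
                                 ROWS UNBOUNDED PRECEDING) AS grp
    FROM decided
)
SELECT client, dates,
       -- rule C: carry the group's decision forward; 0 before any decision exists
       COALESCE(MAX(decision) OVER (PARTITION BY client, grp), 0) AS final_result
FROM grouped
ORDER BY client, dates;
```

For the sample rows, the first six days come out as 0 because the window is incomplete. Day seven becomes 1, and the last two days keep 1 because their windows are mixed.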