Problem description
I am working on a view that contains the following highly repetitive statement:
SELECT 10 as [TopN],
  (Select SUM(A) FROM (Select TOP 10 A FROM [dbo].TheTable ORDER BY A DESC) T) as A,
  (Select SUM(B) FROM (Select TOP 10 B FROM [dbo].TheTable ORDER BY B DESC) T) as B,
  (Select SUM(C) FROM (Select TOP 10 C FROM [dbo].TheTable ORDER BY C DESC) T) as C,
  (Select SUM(D) FROM (Select TOP 10 D FROM [dbo].TheTable ORDER BY D DESC) T) as D,
  (Select SUM(E) FROM (Select TOP 10 E FROM [dbo].TheTable ORDER BY E DESC) T) as E
UNION ALL
SELECT 100 as [TopN],
  (Select SUM(A) FROM (Select TOP 100 A FROM [dbo].TheTable ORDER BY A DESC) T) as A,
  (Select SUM(B) FROM (Select TOP 100 B FROM [dbo].TheTable ORDER BY B DESC) T) as B,
  (Select SUM(C) FROM (Select TOP 100 C FROM [dbo].TheTable ORDER BY C DESC) T) as C,
  (Select SUM(D) FROM (Select TOP 100 D FROM [dbo].TheTable ORDER BY D DESC) T) as D,
  (Select SUM(E) FROM (Select TOP 100 E FROM [dbo].TheTable ORDER BY E DESC) T) as E
UNION ALL
SELECT 1000 as [TopN],
  (Select SUM(A) FROM (Select TOP 1000 A FROM [dbo].TheTable ORDER BY A DESC) T) as A,
  (Select SUM(B) FROM (Select TOP 1000 B FROM [dbo].TheTable ORDER BY B DESC) T) as B,
  (Select SUM(C) FROM (Select TOP 1000 C FROM [dbo].TheTable ORDER BY C DESC) T) as C,
  (Select SUM(D) FROM (Select TOP 1000 D FROM [dbo].TheTable ORDER BY D DESC) T) as D,
  (Select SUM(E) FROM (Select TOP 1000 E FROM [dbo].TheTable ORDER BY E DESC) T) as E
UNION ALL
--etc...
-- The same 7 lines of code repeated dozens of times for different values of `TopN`
The goal is to generate a table of top-value summations for each column.
This is what the table looks like:
| TOPN | A | B | C | D | E |
| 10 | 234 | ...
| 100 | 734 | ...
| 1000 | 1298 | ...
| ... | ... | ...
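Why is this query needed?
In the real world, a summary report like this answers questions such as:
- For column "A": "What is the total income of the top 10 earners?", "What is the total income of the top 100 earners?", and so on.
- For column "B": "What is the total debt of the top 10 debt holders?", and so on.
And so forth. Each column is a report based on that column's own independent ordering. The table above is therefore the end-user deliverable.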
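What am I looking for?
A version of the above query that is any of the following: simpler, deduplicated, more efficient, more maintainable.
The above query works fine and produces the desired table that I mocked up above. But it is clearly inefficient. For example, someone doing this by hand would be able to:
- Sort by each column once
- Start summing until they reach the 10th sorted item, then spit out the total
- Keep summing until they reach the 100th sorted item, then spit out the total
- And so on... (that is, without re-sorting or starting again from element 1).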
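The manual procedure in the list above can be sketched with window functions. This is a minimal sketch for column A only, not a tested replacement for the view. It assumes SQL Server 2012 or later (for the ROWS frame), and the names Ranked, Rn, and RunningA are made up for illustration.

```sql
-- Sketch: sort by A once, keep a running total, and read it off at rows 10, 100, 1000.
-- dbo.TheTable and column A come from the question; Ranked, Rn, RunningA are hypothetical names.
WITH Ranked AS (
    SELECT
        ROW_NUMBER() OVER (ORDER BY A DESC) AS Rn,
        SUM(A) OVER (ORDER BY A DESC ROWS UNBOUNDED PRECEDING) AS RunningA
    FROM dbo.TheTable
)
SELECT Rn AS TopN, RunningA AS A
FROM Ranked
WHERE Rn IN (10, 100, 1000);
```

Unlike the original TOP queries, this returns no row for a given TopN when the table has fewer rows than that value. If A contains ties, adding a unique tie-breaker column to both ORDER BY clauses would keep the two window functions in the same row order.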
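The above query is also repetitive. For example, if this were a stored procedure, it could loop over a list of values (10, 100, 1000, etc.) and build this table one row at a time from a single block of parameterized code (as in @Larnu's answer below). However, a view does not support that approach. Since the current implementation is a view, converting it to a stored procedure or function that has to be called differently would be considered a regression, because every existing usage would have to be modified.
So all I am really asking is whether there is any way to make this better.
My thoughts
Ideally, I could inline the list of values, for example: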
Select * From (VALUES (10),(100),(1000),(5000),...) AS TOPN_VALUES(TOPN)
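Or I would be happy to keep these values somewhere in a simple one-column table.
Either way, what remains is (probably) some clever join or cross apply logic that generates all of the table entries above from that list of numbers, instead of hard-coding the numbers in dozens of small copy-pasted Select statements as in the original query.
One thing that is clear to me is that SELECT TOP X cannot be parameterized in a view, so at the very least we will have to reimplement that logic another way. One possible solution is to rewrite: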
Select SUM(A) FROM (Select TOP 10 A FROM [dbo].SomeTable ORDER BY A DESC) T
as:
Select SUM(A) FROM (Select A,ROW_NUMBER() over (order by A desc) as [Rank] FROM [dbo].SomeTable) T
WHERE [Rank] <= 10
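At that point, the "10" above could be determined dynamically from a value in another table. There is still a lot of work to do to get to a complete solution, though...
Thanks for your help.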
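One way the ROW_NUMBER idea might be combined with an inline value list is sketched below. It is an untested illustration that stays within a view. The view name dbo.TopNSummary, the CTE name Ranked, and the RankA to RankE columns are hypothetical, and the table is assumed to be dbo.TheTable as in the original query.

```sql
-- Sketch: rank every column once, then sum per TopN with conditional aggregation.
-- dbo.TopNSummary, Ranked, and RankA..RankE are hypothetical names.
CREATE VIEW dbo.TopNSummary
AS
WITH Ranked AS (
    SELECT A, B, C, D, E,
           ROW_NUMBER() OVER (ORDER BY A DESC) AS RankA,
           ROW_NUMBER() OVER (ORDER BY B DESC) AS RankB,
           ROW_NUMBER() OVER (ORDER BY C DESC) AS RankC,
           ROW_NUMBER() OVER (ORDER BY D DESC) AS RankD,
           ROW_NUMBER() OVER (ORDER BY E DESC) AS RankE
    FROM dbo.TheTable
)
SELECT V.TopN,
       SUM(CASE WHEN R.RankA <= V.TopN THEN R.A END) AS A,
       SUM(CASE WHEN R.RankB <= V.TopN THEN R.B END) AS B,
       SUM(CASE WHEN R.RankC <= V.TopN THEN R.C END) AS C,
       SUM(CASE WHEN R.RankD <= V.TopN THEN R.D END) AS D,
       SUM(CASE WHEN R.RankE <= V.TopN THEN R.E END) AS E
FROM (VALUES (10), (100), (1000)) AS V(TopN)  -- add further TopN values here
CROSS JOIN Ranked AS R
GROUP BY V.TopN;
```

The list of TopN values now lives in one place, but the cross join pairs every table row with every TopN value, so whether this is actually faster than the original would need to be measured on real data.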
Solution
If I am reading between the lines correctly, use an inline table-valued function:
CREATE FUNCTION dbo.YourFunction (@Top int)
RETURNS table
AS RETURN
SELECT @Top AS [TopN],
       (SELECT SUM(A) FROM (Select TOP (@Top) A FROM [dbo].SomeTable ORDER BY A DESC) T) as A,
       (SELECT SUM(B) FROM (Select TOP (@Top) B FROM [dbo].SomeTable ORDER BY B DESC) T) as B,
       (SELECT SUM(C) FROM (Select TOP (@Top) C FROM [dbo].SomeTable ORDER BY C DESC) T) as C,
       (SELECT SUM(D) FROM (Select TOP (@Top) D FROM [dbo].SomeTable ORDER BY D DESC) T) as D,
       (SELECT SUM(E) FROM (Select TOP (@Top) E FROM [dbo].SomeTable ORDER BY E DESC) T) as E;
GO
Then you simply call the function like this:
SELECT *
FROM dbo.YourFunction(10);
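Since the question asks for the result to stay a view, one possible follow-up is to apply the function to an inline list of values inside a view. This is a sketch, assuming dbo.YourFunction has been created as above. The view name dbo.TopNSummaryViaFunction and the aliases V and F are hypothetical.

```sql
-- Sketch: one function call per TopN value, wrapped in a view.
-- dbo.TopNSummaryViaFunction, V, and F are hypothetical names.
CREATE VIEW dbo.TopNSummaryViaFunction
AS
SELECT F.TopN, F.A, F.B, F.C, F.D, F.E
FROM (VALUES (10), (100), (1000)) AS V(TopN)  -- add further TopN values here
CROSS APPLY dbo.YourFunction(V.TopN) AS F;
```

Existing callers could then keep selecting from a view, and adding a new TopN value means editing only the VALUES list. Each value still runs its own set of TOP queries, so this removes the duplication in the code but not the repeated sorting.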