Problem description
My data looks like this:
+------+-------+------+-------+
| year | month | name | value |
+------+-------+------+-------+
| 2017 | 1     | John | 100   |
| 2017 | 2     | Doe  | 200   |
| 2017 | 3     | Jane | 300   |
| .    | .     | .    | .     |
| 2018 | 1     | John | 150   |
| 2018 | 2     | Doe  | 250   |
| 2018 | 3     | Jane | 350   |
+------+-------+------+-------+
I am trying to compute, for each year and month, the average value of the top 2 names. I can do that with the query below:
select year, month, avg(sum_value) as avg_of_2
from (
    -- month must be selected and grouped here: the window partition and the outer query both use it
    select year, month, name,
           sum(value) as sum_value,
           rank() over (partition by year, month order by sum(value) desc) as rnk
    from database.table_a
    group by year, month, name
) tbl_for_2
where rnk <= 2 -- keep only the top 2 names per year and month
group by 1, 2
order by 1, 2;
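But now I want to extend the average from the top 2 names to the top 5, 10 and 50. Is there a way to do this with the rank without repeating the same query?
My final result would be: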
+------+-------+----------+----------+----------+
| year | month | avg_2    | avg_5    | avg_10   |
+------+-------+----------+----------+----------+
| 2017 | 1     | some_val | some_val | some_val |
| 2017 | 2     | some_val | some_val | some_val |
| ..   |       |          |          |          |
| ..   |       |          |          |          |
+------+-------+----------+----------+----------+
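Solution
There is no way to return a dynamic number of columns; every column has to be written out. Just use filtered aggregates in the outer query, so the ranked subquery is evaluated only once: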
select year, month,
       AVG(sum_value) FILTER (WHERE rnk <= 2) as avg_2,
       AVG(sum_value) FILTER (WHERE rnk <= 5) as avg_5,
       AVG(sum_value) FILTER (WHERE rnk <= 10) as avg_10,
       -- .................
       AVG(sum_value) FILTER (WHERE rnk <= 100) as avg_100
       -- ... and so on, one column per cutoff
from (
    select year, month, name,
           sum(value) as sum_value,
           rank() over (partition by year, month order by sum(value) desc) as rnk
    from database.table_a
    group by year, month, name
) tbl
group by 1, 2
order by 1, 2;
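
If your database does not support the FILTER clause, the same idea can likely be expressed with a CASE expression inside the aggregate. This is only a sketch, assuming the same table database.table_a as in the question and assuming the inner query selects and groups by month as well. It relies on the fact that a CASE without an ELSE branch yields NULL and that AVG ignores NULLs.

```sql
-- Sketch: CASE-based equivalent of the filtered aggregates.
-- Rows whose rank is above the cutoff become NULL and are skipped by AVG.
select year, month,
       avg(case when rnk <= 2  then sum_value end) as avg_2,
       avg(case when rnk <= 5  then sum_value end) as avg_5,
       avg(case when rnk <= 10 then sum_value end) as avg_10
from (
    -- Assumption: month is part of the grouping, matching the window partition.
    select year, month, name,
           sum(value) as sum_value,
           rank() over (partition by year, month order by sum(value) desc) as rnk
    from database.table_a
    group by year, month, name
) tbl
group by year, month
order by year, month;
```

Note that rank() gives tied names the same rank, so a "top 2" bucket can contain more than two rows when values tie. If exactly N rows per bucket are required, row_number() could be swapped in for rank().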