How to aggregate columns and compute a cumulative sum in SQL (the equivalent of mutate() and cumsum() in R)

Problem description

The data looks like this:

road_id   count   centimeters
road111     123           502
road123     345           234
road124    3256         23498

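My goal is to calculate the vehicle kilometers traveled (vkt) for each row, compute each row's share of the total, sort by count, and then compute the cumulative sum of those shares. I know how to do this in R, but I am struggling to reproduce it in SQL.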

This is my R code, which works correctly:

data_perc <- data %>% 
             mutate(kilometers = centimeters/100000) %>%
             mutate(vkt = count*kilometers) %>%
             mutate(vktperc = vkt/sum(vkt)) %>%
             arrange(desc(count)) %>%
             mutate(vktcumsum = cumsum(vktperc))

In SQL I tried the following, but it produces an error:

select road_id,count,geom,centimeters/100000 as kilometers,cap1.y as vkt,cap2.z as vktperc,sum(vktperc) as vktcumsum
from roaddata rd1
inner join roaddata rd2 on rd1.road_id >= rd2.road_id
cross apply (select count*kilometers as product) cap1(y)
cross apply (select (vkt/sum(vkt))*100 as percentage) cap2(z)
group by road_id,count
order by count desc;

Solution

If I understand correctly, you want something like this:

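select rd.*,
       centimeters / 100000.0 as kilometers,  -- 100000.0 avoids integer division
       count * centimeters / 100000.0 as vkt,
       -- running vkt total in descending count order (as in the R code), over the grand total
       sum(count * centimeters / 100000.0) over (order by count desc) /
         sum(count * centimeters / 100000.0) over () as vktcumsum
from roaddata rd
order by count desc;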

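But without the expected results and a clear explanation (not code!), it is hard to know whether this is really what you are trying to do.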
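
One way to check the window-function approach against the three sample rows is to run it in an in-memory SQLite database from Python. This is only a sketch under stated assumptions: the question does not say which database is used, so SQLite (3.25 or later, for window functions) is a stand-in. The vktperc column is added here to mirror the R pipeline, and the run_query helper is a hypothetical name. The count column is quoted because it shares its name with an aggregate function.

```python
import sqlite3

# Sample rows from the question: (road_id, count, centimeters).
ROWS = [
    ("road111", 123, 502),
    ("road123", 345, 234),
    ("road124", 3256, 23498),
]

# Assumption: SQLite >= 3.25 so that window functions are available.
# Dividing by 100000.0 (not 100000) keeps the arithmetic in floating point.
QUERY = """
select road_id,
       "count",
       centimeters / 100000.0 as kilometers,
       "count" * centimeters / 100000.0 as vkt,
       "count" * centimeters / 100000.0
         / sum("count" * centimeters / 100000.0) over () as vktperc,
       sum("count" * centimeters / 100000.0) over (order by "count" desc)
         / sum("count" * centimeters / 100000.0) over () as vktcumsum
from roaddata
order by "count" desc
"""


def run_query(rows=ROWS):
    """Load the rows into an in-memory table and return the query result."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            'create table roaddata '
            '(road_id text, "count" integer, centimeters integer)'
        )
        conn.executemany("insert into roaddata values (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_query():
        print(row)
```

Note that a window with only an order by clause uses a range frame by default, so rows with equal count values would share the same running total, whereas cumsum() in R steps through them one by one. If ties are possible, adding "rows between unbounded preceding and current row" to the window, with a tie-breaking column in the order by, should give row-by-row behaviour.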