How to use percentile_cont with multiple quantiles in Postgres

Question

I currently have a query like this:

select AVG(t2 - t1) as delay,
       percentile_cont(0.25) within group (order by (t2 - t1)) as q25,
       percentile_cont(0.5) within group (order by (t2 - t1)) as median,
       percentile_cont(0.75) within group (order by (t2 - t1)) as q75,
       p.bool1,
       p.cat1
from people p
group by p.bool1, p.cat1
order by p.cat1, p.bool1

However, I read on the Postgres aggregate functions page: https://www.postgresql.org/docs/9.4/functions-aggregate.html

that I should be able to specify multiple quantiles:

Function: percentile_cont(fractions) WITHIN GROUP (ORDER BY sort_expression)
Direct argument type: double precision[]
Aggregated argument type: double precision or interval
Return type: array of sort expression's type
Description: multiple continuous percentile: returns an array of results matching the shape of the fractions parameter, with each non-null element replaced by the value corresponding to that percentile

I'd like to use this so that I don't recompute t2 - t1 for each quantile. What is the correct syntax for getting multiple quantiles? Do I need a subquery?

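Answer

"I'd like to use this so that I don't recompute t2 - t1 for each quantile"

A lateral join can help in this case, because it lets you name the expression once and refer to it by alias: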

select AVG(s.col) as delay,
       percentile_cont(0.25) within group (order by s.col) as q25,
       percentile_cont(0.5) within group (order by s.col) as median,
       percentile_cont(0.75) within group (order by s.col) as q75,
       p.bool1,
       p.cat1
from people p, LATERAL (SELECT t2 - t1 AS col) s
group by p.bool1, p.cat1
order by p.cat1, p.bool1;

Related: PostgreSQL: using a calculated column in the same query
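
On whether a subquery is needed: it is not required, but a derived table is another way to write the difference only once. The following is a minimal sketch, assuming t1, t2, bool1 and cat1 are columns of people as in the question; the aliases d and diff are hypothetical names introduced here.

```sql
-- Sketch: compute t2 - t1 once in a derived table, then aggregate over the alias.
-- "d" and "diff" are illustrative names, not part of the original schema.
select avg(d.diff) as delay,
       percentile_cont(0.25) within group (order by d.diff) as q25,
       percentile_cont(0.5) within group (order by d.diff) as median,
       percentile_cont(0.75) within group (order by d.diff) as q75,
       d.bool1,
       d.cat1
from (
    select t2 - t1 as diff, bool1, cat1
    from people
) d
group by d.bool1, d.cat1
order by d.cat1, d.bool1;
```

Both this form and the lateral join mainly avoid repeating the expression in the query text; neither sketch shows how many times the planner actually evaluates it.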


To pass several fractions at once, supply an array. An array can be written either as ARRAY[0.25,0.5,0.75] or as '{0.25,0.5,0.75}'::double precision[]

select AVG(t2 - t1) as delay,
   -- 1: ARRAY constructor
   percentile_cont(ARRAY[0.25,0.5,0.75]) within group (order by (t2 - t1)) as q_1,
   -- 2: array literal cast to double precision[]
   percentile_cont('{0.25,0.5,0.75}'::double precision[])
   within group (order by (t2 - t1)) as q_2,
   p.bool1,p.cat1
from people p
group by p.bool1,p.cat1;

db<>fiddle demo


"Is there an easy way to name each resulting percentile field inline, as with q25, q50, q75, and so on?"

WITH cte AS (
    select AVG(t2 - t1) as delay,
           percentile_cont(ARRAY[0.25,0.5,0.75]) within group (order by (t2 - t1)) as q,
           p.bool1,
           p.cat1
    from people p
    group by p.bool1,p.cat1
)
select cte.*,q[1] AS q25,q[2] AS q50,q[3] AS q75
from cte
order by cat1,bool1;

db<>fiddle demo2
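
To see the array result and the subscripting without the people table, here is a small self-contained sketch. It assumes nothing from the original schema: generate_series(1, 5) stands in for the t2 - t1 values, and the aliases g, x and s are hypothetical.

```sql
-- Self-contained sketch: the integers 1..5 stand in for the (t2 - t1) values.
select s.q,
       s.q[1] as q25,   -- PostgreSQL array subscripts start at 1 by default
       s.q[2] as q50,
       s.q[3] as q75
from (
    select percentile_cont(ARRAY[0.25,0.5,0.75])
           within group (order by x::double precision) as q
    from generate_series(1, 5) as g(x)
) s;
-- q = {2,3,4}, q25 = 2, q50 = 3, q75 = 4
```

The fractions are matched to results by position, so the order of the fractions in the array determines which subscript holds which percentile.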