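Problem description
I have a very long query that I want to save as a stored procedure and use later in an ETL job. It affects more than 15 million rows and takes an hour and a half to complete. I am using Postgres and pgAdmin.
Code: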
INSERT INTO t_temporary_id_table (
ref_date,id,client,slcunda,count_single_1,count_double_2,count_e,count_g,count_m,active,valid_till_max,id_2,created,lastmodified
)
with
cte_tmp as (
select
a.id,mm.tenant,c.slcunda,d.single,d.count_single,e.double_1,e.count_double_1,
sum(case when id_role = 'E' then 1 else 0 end) as count_e,
sum(case when id_role = 'G' then 1 else 0 end) as count_g,
sum(case when id_role = 'M' then 1 else 0 end) as count_m,
case when min(status) = 'active' then 1 else 0 end as active,
max(valid_till) as valid_till_max
from schema1.struct a
inner join
(
select
id,max(valid_till) valid_till_max
from
schema1.struct a
group by
id
) b
on
a.id=b.id and a.valid_till = b.valid_till_max
left outer join
schema2.tenants mm on a.tsl_1_2 = mm.tenant
left outer join (
select
id,key_1 as slcunda
from
schema1.t_id
where
id in (Select id from schema1.t_id group by id having count(id)=1)
) c
on a.id=c.id
left outer join(
select
id,count_single,single
from (
select
id,id_2 as single,id_2_role,count(id) over(partition by id) as count_single,row_number() over(partition by id order by id,id_2_role desc) as rn
from
schema1.different_id_2
where
id_2_role in ('03','08','17')
) a
where rn=1
) d
on a.id=d.id
left outer join(
select
id,count_double_1,double_1
from (
select
id,id_2 as double_1,count(id) over(partition by id) as count_double_1,row_number() over(partition by id order by id,id_2_role desc) as rn
from
schema1.different_id_2
where
id_2_role in ('06','19')
) a
where rn=1
) e
on a.id=e.id
group by a.id,mm.tenant,c.slcunda,d.single,d.count_single,e.double_1,e.count_double_1
),y as (
select *
from (
SELECT
cte_tmp.id,count_single as count_single_1,count_double_1 as count_double_2,b.id_2
FROM
cte_tmp
inner join (
select
id,id_2
from (
select
id,id_theory as id_2,row_number() over(partition by id) rn
from
schema1.struct
) a
where rn=1
) b
on cte_tmp.id=b.id
where
count_e=1 and count_g=0 and count_m=0 and count_single=0
union all
SELECT
id,single as id_2
FROM
cte_tmp
where
count_e=1 and count_g=0 and count_m=0 and count_single>=1
union all
SELECT
id,double_1 as id_2
FROM
cte_tmp
where
count_e=0 and count_g=1 and count_m>=1 and active=1 and count_double_1>=1
union all
SELECT
id,double_1 as id_2
FROM
cte_tmp
where
count_e<>1 and count_g<>0 and count_m<>0 and active=0 and count_double_1>=1
) a
),z as (
SELECT
cte_tmp.id,valid_till_max
FROM cte_tmp
except
select
id,valid_till_max
from y
),temporary_result as (
select
id::bigint,'' as id_2
from z
union all
select
id::bigint,id_2
from y
)
select
Now(),id_2::bigint,Now(),Now()
from temporary_result
I have indexes on:
- schema1.struct: columns "id" and "valid_till"
- schema1.t_id: column "id"
- schema1.different_id_2: columns "id" and "id_2_role"
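For readers who want to reproduce the setup, the indexes listed above could be defined roughly as follows. This is only a sketch: the post does not say whether these are composite or single-column indexes, so composite indexes are an assumption here, and all index names are hypothetical.

```sql
-- Assumed composite index on (id, valid_till); the index name is hypothetical.
CREATE INDEX IF NOT EXISTS idx_struct_id_valid_till
    ON schema1.struct (id, valid_till);

-- Single-column index on id; the index name is hypothetical.
CREATE INDEX IF NOT EXISTS idx_t_id_id
    ON schema1.t_id (id);

-- Assumed composite index on (id, id_2_role); the index name is hypothetical.
CREATE INDEX IF NOT EXISTS idx_different_id_2_id_role
    ON schema1.different_id_2 (id, id_2_role);
```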
I am new to this, so any suggestions would be much appreciated.
The EXPLAIN output for the query is as follows:
Node (actual rows and loops):
- Aggregate (rows=13953682 loops=1)
- Sort (rows=15791738 loops=1)
- Hash Left Join (rows=15791738 loops=1) Hash Cond: (a.id = a_2.id)
- Hash Left Join (rows=15791738 loops=1) Hash Cond: (a.id = a_1.id)
- Hash Right Join (rows=15791738 loops=1) Hash Cond: (t_id.id = a.id)
- Hash Inner Join (rows=60629 loops=1) Hash Cond: (t_id.id = t_id_1.id)
- Seq Scan on t_id as t_id (rows=45144181 loops=1)
- Hash (rows=60629 loops=1) Buckets: 131072 Batches: 2 Memory Usage: 2241 kB
- Aggregate (rows=60629 loops=1) Filter: (count(t_id_1.id) = 1) Rows Removed by Filter: 15056381
- Gather Merge (rows=31764065 loops=1)
- Aggregate (rows=10588022 loops=3)
- Sort (rows=15048060 loops=3)
- Seq Scan on t_id as t_id_1 (rows=15048060 loops=3)
- Hash (rows=15791738 loops=1) Buckets: 65536 Batches: 512 Memory Usage: 3585 kB
- Gather (rows=15791738 loops=1)
- Hash Left Join (rows=5263913 loops=3) Hash Cond: (a.tsl_1_2 = (mm.tenant)::numeric)
- Hash Inner Join (rows=5263913 loops=3) Hash Cond: ((a.id = struct.id) AND (a.valid_till = (max(struct.valid_till))))
- Seq Scan on struct (rows=5292460 loops=3)
- Hash (rows=13953682 loops=3) Buckets: 131072 Batches: 256 Memory Usage: 3575 kB
- Aggregate (rows=13953682 loops=3)
- Sort (rows=15877381 loops=3)
- Seq Scan on struct as struct (rows=15877381 loops=3)
- Hash (rows=54 loops=3) Buckets: 1024 Batches: 1 Memory Usage: 11 kB
- Seq Scan on tenants as mm (rows=54 loops=3)
- Hash (rows=7983 loops=1) Buckets: 8192 Batches: 1 Memory Usage: 634 kB
- Subquery Scan (rows=7983 loops=1) Filter: (a_1.rn = 1) Rows Removed by Filter: 12669
- Sort (rows=20652 loops=1)
- WindowAgg (rows=20652 loops=1)
- WindowAgg (rows=20652 loops=1)
- Sort (rows=20652 loops=1)
- Gather (rows=20652 loops=1)
- Seq Scan on different_id_2 as different_id_2 (rows=6884 loops=3) Filter: (id_2_role = ANY ('{3,8,17}'::numeric[])) Rows Removed by Filter: 1798703
- Hash (rows=1815522 loops=1) Buckets: 65536 Batches: 64 Memory Usage: 3585 kB
- Subquery Scan (rows=1815522 loops=1) Filter: (a_2.rn = 1) Rows Removed by Filter: 3589410
- Sort (rows=5404932 loops=1)
- WindowAgg (rows=5404932 loops=1)
- WindowAgg (rows=5404932 loops=1)
- Sort (rows=5404932 loops=1)
- Seq Scan on different_id_2 as different_id_2_1 (rows=5404932 loops=1) Filter: (id_2_role = ANY ('{6,19}'::numeric[])) Rows Removed by Filter: 11829
But I don't really understand it!
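A plain-text plan is usually easier to read and share than pgAdmin's flattened table view, because the indentation shows how the nodes nest. The following is a minimal sketch, not a tuned solution. It uses the `max(valid_till)` subquery from the post as a stand-in for the full statement. Note that `EXPLAIN ANALYZE` really executes the statement, so the transaction is rolled back, which matters once the full `INSERT` is substituted. The `work_mem` value is an assumption to experiment with: several Hash nodes in the plan report more than one batch (for example `Batches: 512` at about 3.5 MB), which means the hash table did not fit in `work_mem` and spilled to temporary files.

```sql
BEGIN;

-- Assumed value for experimentation; applies only inside this transaction.
SET LOCAL work_mem = '256MB';

-- Stand-in statement taken from the post; replace it with the full INSERT.
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT id, max(valid_till) AS valid_till_max
FROM schema1.struct
GROUP BY id;

-- EXPLAIN ANALYZE executes the statement, so undo any changes it made.
ROLLBACK;
```

If the Hash nodes then report `Batches: 1`, the hash tables fit in memory. Whether that shortens the overall runtime noticeably would still have to be measured.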