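Problem description
I need to run a query that returns every request that was not finished or canceled between the start of our records and a specific date. The way I do it today takes a very long time and ends with the error "User query might have needed to see row versions that must be removed" (my guess is that this is due to a lack of RAM).
The query I am using is below, after some background on the tables:
- T1 stores each new request, with its ID, creation date, status (open, closed) and keys to several other tables.
- T2 stores every change made to each request (in progress, waiting, rejected and closed), the date of the change and keys to other tables.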
    SELECT T1.id_request, T1.dt_created, T1.status
    FROM T1
    LEFT JOIN T2
           ON T1.id_request = T2.id_request
    WHERE (T1.dt_created >= '2012-01-01 00:00:00'
           AND T1.dt_created <= '2020-05-31 23:59:59')
      AND T1.id_request NOT IN (
            SELECT T2.id_request
            FROM T2
            WHERE ((T2.dt_change >= '2012-01-01 00:00:00'
                    AND T2.dt_change <= '2020-05-31 23:59:59')
                   OR T2.dt_change IS NULL)
              AND T2.status IN ('Closed','Canceled','rejected'))
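My idea is to take everything that was received, from T1, between the start of the records and, say, the end of May. (I cannot simply retrieve what is open right now, because that only works for today and not for a specific past date, which is what I want.) Then I use WHERE T1.ID NOT IN (the T2.ID values whose status is "closed" in the same period). But, as I said, this takes forever and returns an error.
I use the same code to get what was open in a specific month (the 1st to the 30th), and it works fine.
Maybe this approach is not the best one, but I could not think of any other (I am not an SQL expert). If there is not enough information to give an answer, feel free to ask.
As requested by @MikeOrganek, here is the output of the analyzer: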
Nested Loop Left Join (cost=27985.55..949402.48 rows=227455 width=20) (actual time=2486.433..54832.280 rows=47726 loops=1)
  Buffers: shared hit=293242 read=260670
  Seq Scan on T1 (cost=27984.99..324236.82 rows=73753 width=20) (actual time=2467.499..6202.970 rows=16992 loops=1)
    Filter: ((dt_created >= '2020-05-01 00:00:00-03'::timestamp with time zone) AND (dt_created <= '2020-05-31 23:59:59-03'::timestamp with time zone) AND (NOT (hashed SubPlan 1)))
    Rows Removed by Filter: 6085779
    Buffers: shared hit=188489 read=250098
    SubPlan 1
      Nested Loop (cost=7845.36..27983.13 rows=745 width=4) (actual time=129.379..1856.518 rows=168690 loops=1)
        Buffers: shared hit=60760
        Seq Scan on T3(cost=0.00..5.21 rows=3 width=8) (actual time=0.057..0.104 rows=3 loops=1)
          Filter: ((status_request)::text = ANY ('{Closed,Canceled,rejected}'::text[]))
          Rows Removed by Filter: 125
          Buffers: shared hit=7
        Bitmap Heap Scan on T2(cost=7845.36..9321.70 rows=427 width=8) (actual time=477.324..607.171 rows=56230 loops=3)
          Recheck Cond: ((dt_change >= '2020-05-01 00:00:00-03'::timestamp with time zone) AND (dt_change <= '2020-05-31 23:59:59-03'::timestamp with time zone) AND (T2.ID_status= T3.ID_status))
          Rows Removed by Index Recheck: 87203
          Heap Blocks: exact=36359
          Buffers: shared hit=60753
          BitmapAnd (cost=7845.36..7845.36 rows=427 width=0) (actual time=473.864..473.864 rows=0 loops=3)
            Buffers: shared hit=24394
            Bitmap Index Scan on idx_ix_T2_dt_change (cost=0.00..941.81 rows=30775 width=0) (actual time=47.380..47.380 rows=306903 loops=3)
              Index Cond: ((dt_change >= '2020-05-01 00:00:00-03'::timestamp with time zone) AND (dt_change<= '2020-05-31 23:59:59-03'::timestamp with time zone))
              Buffers: shared hit=2523
            Bitmap Index Scan on idx_T2_ID_status (cost=0.00..6895.49 rows=262724 width=0) (actual time=418.942..418.942 rows=2105165 loops=3)
              Index Cond: (ID_status = T3.ID_status )
              Buffers: shared hit=21871
  Index Only Scan using idx_ix_T2_id_request on T2 (cost=0.56..8.30 rows=18 width=4) (actual time=0.369..2.859 rows=3 loops=16992)
    Index Cond: (id_request = t17.id_request )
    Heap Fetches: 44807
    Buffers: shared hit=104753 read=10572
Planning time: 23.424 ms
Execution time: 54841.261 ms
And this is the main difference when dt_change IS NULL is included:
Planning time: 34.320 ms
Execution time: 230683.865 ms
Thanks
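Solution
The OR T2.dt_change IS NULL condition looks very expensive: with it, the overall execution time goes from about 55 seconds to about 231 seconds, more than four times as long.
The only option I see is to change NOT IN to NOT EXISTS, as follows.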
    SELECT T1.id_request, T1.dt_created, T1.status
    FROM T1
    LEFT JOIN T2
           ON T1.id_request = T2.id_request
    WHERE T1.dt_created >= '2012-01-01 00:00:00'
      AND T1.dt_created <= '2020-05-31 23:59:59'
      -- c is the inner copy of T2, aliased so it is not confused with the joined T2
      AND NOT EXISTS (SELECT 1
                      FROM T2 AS c
                      WHERE c.id_request = T1.id_request
                        AND ((c.dt_change >= '2012-01-01 00:00:00'
                              AND c.dt_change <= '2020-05-31 23:59:59')
                             OR c.dt_change IS NULL)
                        AND c.status IN ('Closed','Canceled','rejected'))
However, I expect this to give you only a small improvement. Can you check whether this change helps at all?
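One way to check is to run the rewritten query under EXPLAIN (ANALYZE, BUFFERS) and compare the plan and timings with the ones posted in the question. The sketch below is only a suggestion built on the simplified schema from the question (T2.status as a plain column). It also drops the LEFT JOIN to T2, on the assumption that only T1 columns are needed in the result. In the posted plan that join turns 16992 rows from T1 into 47726 output rows, so removing it changes the row count if those duplicates were intended.

```sql
-- Sketch only: assumes the simplified schema from the question and that
-- one row per request is wanted, so the LEFT JOIN to T2 is left out.
EXPLAIN (ANALYZE, BUFFERS)
SELECT T1.id_request, T1.dt_created, T1.status
FROM T1
WHERE T1.dt_created >= '2012-01-01 00:00:00'
  AND T1.dt_created <= '2020-05-31 23:59:59'
  AND NOT EXISTS (
        SELECT 1
        FROM T2 AS c                      -- c: alias for the change rows
        WHERE c.id_request = T1.id_request
          AND (   (c.dt_change >= '2012-01-01 00:00:00'
               AND c.dt_change <= '2020-05-31 23:59:59')
               OR c.dt_change IS NULL)
          AND c.status IN ('Closed', 'Canceled', 'rejected')
      );
```

Keep in mind that EXPLAIN ANALYZE actually executes the statement, so it can take as long as the query itself. The lines to compare with the earlier output are "Execution time" and the "Buffers" counts.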