PostgreSQL: return all requests that were not yet finished as of a specific date

Problem description

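I need a query that returns all requests that had not been completed or canceled between the beginning of the records and a specific date. The way I'm doing it now takes a very long time and returns an error: "User query might have needed to see row versions that must be removed" (my guess is that this is due to a lack of RAM).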

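Below is some background on the tables, followed by the query I'm using:

  • T1, where each new entry is saved, with its ID, creation date, status (open, closed), and keys to several other tables.

  • T2, where every change to each request is saved (in progress, waiting, rejected, and closed), with the change date and keys to other tables.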

    SELECT T1.id_request,T1.dt_created,T1.status
    FROM T1
    LEFT JOIN T2
        ON T1.id_request = T2.id_request
    WHERE (T1.dt_created >= '2012-01-01 00:00:00' AND T1.dt_created <= '2020-05-31 23:59:59')
        AND T1.id_request NOT IN (SELECT T2.id_request
                                  FROM T2
                                  WHERE ((T2.dt_change >= '2012-01-01 00:00:00' 
                                         AND T2.dt_change <= '2020-05-31 23:59:59')
                                         OR T2.dt_change IS NULL)
                                         AND T2.status IN ('Closed','Canceled','rejected'))
    

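My idea is to take everything that came in from T1 between the beginning of the records and, say, the end of May. I can't simply retrieve what is open right now, because that only reflects today and not a specific past date, which is what I want. Then I use WHERE T1.ID NOT IN (the T2.ID values whose status is "Closed" within the same period). But, as I said, this takes forever and returns an error.

I use the same code to get what was open in a specific month (from the 1st to the 30th) and it works fine.

Maybe this approach isn't the best one, but I couldn't think of any other (I'm not an SQL expert). If there isn't enough information to provide an answer, feel free to ask.

As requested by @MikeOrganek, here is the EXPLAIN ANALYZE output: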

    Nested Loop Left Join  (cost=27985.55..949402.48 rows=227455 width=20) (actual time=2486.433..54832.280 rows=47726 loops=1)
      Buffers: shared hit=293242 read=260670
      ->  Seq Scan on T1  (cost=27984.99..324236.82 rows=73753 width=20) (actual time=2467.499..6202.970 rows=16992 loops=1)
            Filter: ((dt_created >= '2020-05-01 00:00:00-03'::timestamp with time zone) AND (dt_created <= '2020-05-31 23:59:59-03'::timestamp with time zone) AND (NOT (hashed SubPlan 1)))
            Rows Removed by Filter: 6085779
            Buffers: shared hit=188489 read=250098
            SubPlan 1
              ->  Nested Loop  (cost=7845.36..27983.13 rows=745 width=4) (actual time=129.379..1856.518 rows=168690 loops=1)
                    Buffers: shared hit=60760
                    ->  Seq Scan on T3  (cost=0.00..5.21 rows=3 width=8) (actual time=0.057..0.104 rows=3 loops=1)
                          Filter: ((status_request)::text = ANY ('{Closed,Canceled,rejected}'::text[]))
                          Rows Removed by Filter: 125
                          Buffers: shared hit=7
                    ->  Bitmap Heap Scan on T2  (cost=7845.36..9321.70 rows=427 width=8) (actual time=477.324..607.171 rows=56230 loops=3)
                          Recheck Cond: ((dt_change >= '2020-05-01 00:00:00-03'::timestamp with time zone) AND (dt_change <= '2020-05-31 23:59:59-03'::timestamp with time zone) AND (T2.ID_status = T3.ID_status))
                          Rows Removed by Index Recheck: 87203
                          Heap Blocks: exact=36359
                          Buffers: shared hit=60753
                          ->  BitmapAnd  (cost=7845.36..7845.36 rows=427 width=0) (actual time=473.864..473.864 rows=0 loops=3)
                                Buffers: shared hit=24394
                                ->  Bitmap Index Scan on idx_ix_T2_dt_change  (cost=0.00..941.81 rows=30775 width=0) (actual time=47.380..47.380 rows=306903 loops=3)
                                      Index Cond: ((dt_change >= '2020-05-01 00:00:00-03'::timestamp with time zone) AND (dt_change <= '2020-05-31 23:59:59-03'::timestamp with time zone))
                                      Buffers: shared hit=2523
                                ->  Bitmap Index Scan on idx_T2_ID_status  (cost=0.00..6895.49 rows=262724 width=0) (actual time=418.942..418.942 rows=2105165 loops=3)
                                      Index Cond: (ID_status = T3.ID_status)
                                      Buffers: shared hit=21871
      ->  Index Only Scan using idx_ix_T2_id_request on T2  (cost=0.56..8.30 rows=18 width=4) (actual time=0.369..2.859 rows=3 loops=16992)
            Index Cond: (id_request = t17.id_request)
            Heap Fetches: 44807
            Buffers: shared hit=104753 read=10572
    Planning time: 23.424 ms
    Execution time: 54841.261 ms

And this is the main difference when dt_change IS NULL is included:

  Planning time: 34.320 ms
  Execution time: 230683.865 ms

Thanks

Solution

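The OR T2.dt_change IS NULL condition looks very expensive: going by the timings you posted, it raises the total execution time roughly fourfold (from about 55 s to about 231 s).

The only option I can see is to change the NOT IN to NOT EXISTS, as shown below.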

SELECT T1.id_request,T1.dt_created,T1.status
  FROM T1
       LEFT JOIN T2
              ON T1.id_request = T2.id_request
 WHERE T1.dt_created >= '2012-01-01 00:00:00' 
   AND T1.dt_created <= '2020-05-31 23:59:59'
   AND NOT EXISTS (SELECT 1
                     FROM T2
                    WHERE id_request = T1.id_request
                      AND (   (    dt_change >= '2012-01-01 00:00:00' 
                               AND dt_change <= '2020-05-31 23:59:59')
                           OR dt_change IS NULL)
                      AND status IN ('Closed','Canceled','rejected'))
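
One caveat about this rewrite: NOT IN and NOT EXISTS are not strictly interchangeable when the subquery can return NULL. The sketch below illustrates the difference on two throwaway temp tables; demo_t1 and demo_t2 are hypothetical names made up for this example and are not the tables from the question. It is also generally the case that PostgreSQL can plan NOT EXISTS as an anti join, whereas NOT IN is evaluated through a subplan, like the "hashed SubPlan 1" in the posted plan.

```sql
-- Hypothetical scratch tables, not the real schema from the question.
CREATE TEMP TABLE demo_t1 (id_request int);
CREATE TEMP TABLE demo_t2 (id_request int);

INSERT INTO demo_t1 VALUES (1), (2), (3);
INSERT INTO demo_t2 VALUES (1), (NULL);

-- NOT IN: "2 NOT IN (1, NULL)" evaluates to NULL rather than TRUE,
-- so this query returns no rows at all.
SELECT id_request
  FROM demo_t1
 WHERE id_request NOT IN (SELECT id_request FROM demo_t2);

-- NOT EXISTS: the NULL row never matches the equality test,
-- so this query returns the rows 2 and 3.
SELECT t1.id_request
  FROM demo_t1 AS t1
 WHERE NOT EXISTS (SELECT 1
                     FROM demo_t2 AS t2
                    WHERE t2.id_request = t1.id_request);
```

If T2.id_request can never be NULL, both forms return the same rows and only the plan differs.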

However, I expect this to give you only a small improvement. Can you check whether this change helps?
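
One way to check is to run the candidate query under EXPLAIN (ANALYZE, BUFFERS) and compare it with the plan from the question. The variant below is only a sketch under two assumptions that go beyond the original query. First, the outer LEFT JOIN to T2 is dropped, because no T2 column is selected; this yields one row per request instead of one row per change, which may or may not be what you want. Second, the date bounds are written as half-open ranges so that timestamps with fractional seconds on the last day are not missed. The column T2.status is kept as written in the question, even though the posted plan suggests the status text actually lives in T3.

```sql
-- Note: EXPLAIN ANALYZE really executes the query, so this takes as long as the query itself.
EXPLAIN (ANALYZE, BUFFERS)
SELECT T1.id_request, T1.dt_created, T1.status
  FROM T1
 WHERE T1.dt_created >= '2012-01-01'
   AND T1.dt_created <  '2020-06-01'   -- half-open upper bound (assumption)
   AND NOT EXISTS (SELECT 1
                     FROM T2
                    WHERE T2.id_request = T1.id_request
                      AND (   (    T2.dt_change >= '2012-01-01'
                               AND T2.dt_change <  '2020-06-01')
                           OR T2.dt_change IS NULL)
                      AND T2.status IN ('Closed','Canceled','rejected'));
```

If the resulting plan shows an anti join in place of the hashed SubPlan, the rewrite has changed the execution strategy; the execution time reported at the bottom tells you whether that actually paid off.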
