Problem description
My database development work currently spans three environments: a local PostgreSQL on Docker (the kartoza/postgis:11.5-2.5 image), and dev and production environments, which are Aurora PostgreSQL on AWS (db.r5.xlarge instances, version 3.1, compatible with PostgreSQL 11.6). Dev and prod are essentially identical: the schema is managed through migrations and data loading is automated, so bugs aside, they hold essentially the same data. Local is kept as close to dev and prod as possible.
I have a query that joins table F to table S via an FK (F.s_id -> S.s_id). On local and in production the query performs well enough, but in dev it is about 3x slower. Looking at the execution plans, the problem is the join. On local and in prod, the index is used for a bitmap heap scan:
-> Bitmap Heap Scan on f f (cost=769.47..28650.46 rows=15911 width=94) (actual time=93.371..93.889 rows=223 loops=19)
Recheck Cond: (s_id = s_2.s_id)
Filter: ((split_part(split_part(region_id,'-'::text,2),'.'::text,1) ~~ 'MX%'::text) AND (((COALESCE(f_date,r_date) >= '2019-03-01 00:00:00'::timestamp without time zone) AND (COALESCE(f_date,r_date) <= '2020-12-31 23:59:59'::timestamp without time zone)) OR ((COALESCE(f_date,r_date) >= '2020-12-31 23:59:59'::timestamp without time zone) AND (COALESCE(f_date,r_date) <= '2019-03-01 00:00:00'::timestamp without time zone)) OR (((f_date IS NULL) OR (f_date_end IS NOT NULL)) AND (COALESCE(f_date,r_date_end) IS NOT NULL) AND (COALESCE(f_date,r_date) <= '2019-03-01 00:00:00'::timestamp without time zone) AND (COALESCE(f_date_end,r_date_end) >= '2019-03-01 00:00:00'::timestamp without time zone))))
Rows Removed by Filter: 603
Heap Blocks: exact=10146
-> Bitmap Index Scan on f_s_id_idx (cost=0.00..765.49 rows=36645 width=0) (actual time=0.931..0.931 rows=1587 loops=19)
Index Cond: (s_id = s_2.s_id)
In dev, however, a sequential scan is performed:
-> Hash Join (cost=0.58..694942.27 rows=285254 width=96) (actual time=9910.557..9922.318 rows=4228 loops=1)
Hash Cond: (f.s_id = s_2.s_id)
-> Seq Scan on f f (cost=0.00..672953.32 rows=5102885 width=94) (actual time=0.160..9556.369 rows=4201870 loops=1)
Filter: ((split_part(split_part(region_id,'-'::text,2),'.'::text,1) ~~ 'MX%'::text) AND (((COALESCE(f_date,r_date) >= '2019-03-01 00:00:00'::timestamp without time zone) AND (COALESCE(f_date,r_date) <= '2020-12-31 23:59:59'::timestamp without time zone)) OR ((COALESCE(f_date,r_date) >= '2020-12-31 23:59:59'::timestamp without time zone) AND (COALESCE(f_date,r_date) <= '2019-03-01 00:00:00'::timestamp without time zone)) OR (((f_date IS NULL) OR (f_date_end IS NOT NULL)) AND (COALESCE(f_date,r_date_end) IS NOT NULL) AND (COALESCE(f_date,r_date) <= '2019-03-01 00:00:00'::timestamp without time zone) AND (COALESCE(f_date_end,r_date_end) >= '2019-03-01 00:00:00'::timestamp without time zone))))
Rows Removed by Filter: 7541863
-> Hash (cost=0.36..0.36 rows=18 width=64) (actual time=0.064..0.064 rows=19 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> CTE Scan on allowed_ss s_2 (cost=0.00..0.36 rows=18 width=64) (actual time=0.044..0.060 rows=19 loops=1)
What I tried
No matter what I tried, that index was not used in dev, not even with sequential scans disabled. With seq scans disabled, an index covering the complex date check was used instead:
-> Hash Join (cost=303647.45..808516.29 rows=285254 width=96) (actual time=5208.611..5214.111 rows=4228 loops=1)
Hash Cond: (f.s_id = s_2.s_id)
-> Bitmap Heap Scan on f f (cost=303646.86..786527.35 rows=5102885 width=94) (actual time=1513.315..4836.337 rows=4201870 loops=1)
Recheck Cond: (((COALESCE(f_date,r_date) <= '2019-03-01 00:00:00'::timestamp without time zone)) OR ((COALESCE(f_date,r_date_end) >= '2019-03-01 00:00:00'::timestamp without time zone)))
Filter: ((split_part(split_part(region_id,'-'::text,2),'.'::text,1) ~~ 'MX%'::text) AND (((COALESCE(f_date,r_date) >= '2019-03-01 00:00:00'::timestamp without time zone) AND (COALESCE(f_date,r_date) <= '2020-12-31 23:59:59'::timestamp without time zone)) OR ((COALESCE(f_date,r_date) >= '2020-12-31 23:59:59'::timestamp without time zone) AND (COALESCE(f_date,r_date) <= '2019-03-01 00:00:00'::timestamp without time zone)) OR (((f_date IS NULL) OR (f_date_end IS NOT NULL)) AND (COALESCE(f_date,r_date_end) IS NOT NULL) AND (COALESCE(f_date,r_date) <= '2019-03-01 00:00:00'::timestamp without time zone) AND (COALESCE(f_date_end,r_date_end) >= '2019-03-01 00:00:00'::timestamp without time zone))))
Rows Removed by Filter: 4341
Heap Blocks: exact=117252
-> BitmapOr (cost=303646.86..303646.86 rows=5895338 width=0) (actual time=1489.570..1489.571 rows=0 loops=1)
-> Bitmap Index Scan on f_dates_idx (cost=0.00..103966.71 rows=4092615 width=0) (actual time=653.589..653.589 rows=4101547 loops=1)
Index Cond: ((COALESCE(f_date,r_date) <= '2020-12-31 23:59:59'::timestamp without time zone))
-> Bitmap Index Scan on f_dates_idx (cost=0.00..1495.75 rows=58719 width=0) (actual time=0.012..0.012 rows=0 loops=1)
Index Cond: ((COALESCE(f_date,r_date) <= '2019-03-01 00:00:00'::timestamp without time zone))
-> Bitmap Index Scan on f_dates_idx (cost=0.00..194357.24 rows=1744004 width=0) (actual time=835.968..835.968 rows=104702 loops=1)
Index Cond: ((COALESCE(f_date,r_date_end) >= '2019-03-01 00:00:00'::timestamp without time zone))
-> Hash (cost=0.36..0.36 rows=18 width=64) (actual time=0.064..0.064 rows=19 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> CTE Scan on allowed_ss s_2 (cost=0.00..0.36 rows=18 width=64) (actual time=0.051..0.060 rows=19 loops=1)
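For context, the plan above was obtained by disabling sequential scans for the session only; roughly like this (diagnostic only, the actual query text is omitted here):

```sql
-- Diagnostic only: make seq scans prohibitively expensive for this session,
-- re-run the query under EXPLAIN, then restore the default.
SET enable_seqscan = off;
-- EXPLAIN (ANALYZE, BUFFERS) <the query under investigation>;
RESET enable_seqscan;
```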
WORK_MEM is already quite high: 256MB. The STATISTICS target for the column is also quite high, at 1000, which is greater than the current number of distinct values.
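For reference, a raised per-column statistics target only takes effect after the next ANALYZE; a sketch using the anonymized names from the plans above:

```sql
-- Raise the sampling target for the join column, then refresh statistics.
-- f and s_id are the anonymized names used in the plans above.
ALTER TABLE f ALTER COLUMN s_id SET STATISTICS 1000;
ANALYZE f;
```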
By the way, this is how the index was created:
create index f_s_id_idx on f(s_id)
The column also has another index with text_pattern_ops for LIKE-pattern searches, but this one was added for the equality comparisons used in the join.
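The LIKE-pattern index presumably looks something like this (the index name is hypothetical; only its existence and operator class are stated above):

```sql
-- text_pattern_ops makes a btree index usable for left-anchored LIKE
-- patterns regardless of collation.
create index f_s_id_pattern_idx on f (s_id text_pattern_ops);
```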
Edit: following @Laurenz's request, here is the full plan of the fast query (with the actual names; in the previous ones I had simplified/anonymized them):
Hash Join (cost=617711.37..617726.42 rows=250 width=96) (actual time=1828.947..1828.959 rows=32 loops=1)
Hash Cond: (r.region_id = sub2.region_id)
CTE public_regions
-> Hash Join (cost=22.47..29.07 rows=250 width=61) (actual time=1.034..1.136 rows=32 loops=1)
Hash Cond: (r_1.source_id = s.source_id)
-> Index Only Scan using regions_region_id_index on regions r_1 (cost=0.56..4.58 rows=266 width=40) (actual time=0.013..0.056 rows=32 loops=1)
Index Cond: ((region_id ~>=~ 'TIMXST-MX'::text) AND (region_id ~<~ 'TIMXST-MY'::text))
Filter: (region_id ~~ 'TIMXST-MX%'::text)
Heap Fetches: 32
-> Hash (cost=17.11..17.11 rows=384 width=23) (actual time=1.009..1.009 rows=338 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 27kB
-> Seq Scan on sources s (cost=0.00..17.11 rows=384 width=23) (actual time=0.473..0.968 rows=338 loops=1)
Filter: (public OR (provider_id = ANY ('{SSSSSSSS,WWWWWWW}'::text[])))
Rows Removed by Filter: 3
CTE ftvs
-> Function Scan on _two_dimensional_array_to_one_dimensional t (cost=0.25..10.25 rows=1000 width=12) (actual time=0.107..0.110 rows=25 loops=1)
CTE allowed_sources
-> Seq Scan on sources s_1 (cost=0.00..18.13 rows=21 width=31) (actual time=0.029..0.111 rows=19 loops=1)
Filter: ((public OR (provider_id = ANY ('{SSSSSSSS,WWWWWWW}'::text[]))) AND (provider_id = ANY ('{SSSSSSSS,WWWWWWW}'::text[])))
Rows Removed by Filter: 322
CTE filtered_facts
-> GroupAggregate (cost=611430.07..616233.39 rows=41768 width=112) (actual time=1807.723..1812.385 rows=2421 loops=1)
Group Key: f.region_id,f.fact_type_id,f.fact_type_version,f.fact_subtype_id,s_2.provider_id
-> Sort (cost=611430.07..611534.49 rows=41768 width=96) (actual time=1807.690..1808.326 rows=4228 loops=1)
Sort Key: f.region_id,f.fact_type_id,f.fact_type_version,f.fact_subtype_id,s_2.provider_id
Sort Method: quicksort Memory: 787kB
-> Hash Join (cost=802.47..608224.35 rows=41768 width=96) (actual time=465.740..1787.072 rows=4228 loops=1)
Hash Cond: ((f.fact_type_id = ftvs.t) AND (f.fact_type_version = ftvs.v) AND (f.fact_subtype_id = ftvs.s))
-> Nested Loop (cost=769.47..605001.34 rows=334141 width=96) (actual time=465.603..1785.045 rows=4228 loops=1)
-> CTE Scan on allowed_sources s_2 (cost=0.00..0.42 rows=21 width=64) (actual time=0.030..0.174 rows=19 loops=1)
-> Bitmap Heap Scan on facts f (cost=769.47..28650.46 rows=15911 width=94) (actual time=93.371..93.889 rows=223 loops=19)
Recheck Cond: (source_id = s_2.source_id)
Filter: ((split_part(split_part(region_id,'-'::text,2),'.'::text,1) ~~ 'MX%'::text) AND (((COALESCE(fact_date,reported_date) >= '2019-03-01 00:00:00'::timestamp without time zone) AND (COALESCE(fact_date,reported_date) <= '2020-12-31 23:59:59'::timestamp without time zone)) OR ((COALESCE(fact_date,reported_date) >= '2020-12-31 23:59:59'::timestamp without time zone) AND (COALESCE(fact_date,reported_date) <= '2019-03-01 00:00:00'::timestamp without time zone)) OR (((fact_date IS NULL) OR (fact_date_end IS NOT NULL)) AND (COALESCE(fact_date,reported_date_end) IS NOT NULL) AND (COALESCE(fact_date,reported_date) <= '2019-03-01 00:00:00'::timestamp without time zone) AND (COALESCE(fact_date_end,reported_date_end) >= '2019-03-01 00:00:00'::timestamp without time zone))))
Rows Removed by Filter: 603
Heap Blocks: exact=10146
-> Bitmap Index Scan on facts_source_id_idx (cost=0.00..765.49 rows=36645 width=0) (actual time=0.931..0.931 rows=1587 loops=19)
Index Cond: (source_id = s_2.source_id)
-> Hash (cost=29.50..29.50 rows=200 width=12) (actual time=0.129..0.129 rows=25 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> HashAggregate (cost=27.50..29.50 rows=200 width=12) (actual time=0.122..0.125 rows=25 loops=1)
Group Key: ftvs.t,ftvs.v,ftvs.s
-> CTE Scan on ftvs (cost=0.00..20.00 rows=1000 width=12) (actual time=0.108..0.116 rows=25 loops=1)
By the way, as you can see in the fast plan, the row counts are overestimated. That could be one reason the seq scan is chosen when things differ subtly elsewhere, but I don't know what else I could do to improve the statistics on that column.
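One way to see what the planner believes about the column is the pg_stats view (a generic sketch with the anonymized names):

```sql
-- Inspect the gathered statistics for the join column.
SELECT n_distinct, null_frac, correlation
FROM pg_stats
WHERE tablename = 'f' AND attname = 's_id';
```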
Full slow plan:
Hash Join (cost=705764.38..705772.86 rows=198 width=96) (actual time=9964.327..9964.344 rows=32 loops=1)
Hash Cond: (r.region_id = acbp.region_id)
CTE public_regions
-> Hash Join (cost=20.82..27.42 rows=250 width=61) (actual time=0.155..0.191 rows=32 loops=1)
Hash Cond: (r_1.source_id = s.source_id)
-> Index Only Scan using regions_region_id_index on regions r_1 (cost=0.56..4.58 rows=266 width=40) (actual time=0.017..0.027 rows=32 loops=1)
Index Cond: ((region_id ~>=~ 'TIMXST-MX'::text) AND (region_id ~<~ 'TIMXST-MY'::text))
Filter: (region_id ~~ 'TIMXST-MX%'::text)
Heap Fetches: 0
-> Hash (cost=16.26..16.26 rows=320 width=23) (actual time=0.132..0.132 rows=338 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 27kB
-> Seq Scan on sources s (cost=0.00..16.26 rows=320 width=23) (actual time=0.006..0.083 rows=338 loops=1)
Filter: (public OR (provider_id = ANY ('{SSSSSSSS,WWWWWWW}'::text[])))
Rows Removed by Filter: 3
CTE ftvs
-> Function Scan on _two_dimensional_array_to_one_dimensional t (cost=0.25..10.25 rows=1000 width=12) (actual time=0.124..0.128 rows=25 loops=1)
CTE allowed_sources
-> Seq Scan on sources s_1 (cost=0.00..17.12 rows=18 width=31) (actual time=0.043..0.053 rows=19 loops=1)
Filter: ((public OR (provider_id = ANY ('{SSSSSSSS,WWWWWWW}'::text[]))) AND (provider_id = ANY ('{SSSSSSSS,WWWWWWW}'::text[])))
Rows Removed by Filter: 322
CTE filtered_facts
-> GroupAggregate (cost=700394.56..704495.12 rows=35657 width=112) (actual time=9942.762..9946.578 rows=2421 loops=1)
Group Key: f.region_id,f.fact_type_id,f.fact_type_version,f.fact_subtype_id,s_2.provider_id
-> Sort (cost=700394.56..700483.70 rows=35657 width=96) (actual time=9942.731..9942.932 rows=4228 loops=1)
Sort Key: f.region_id,f.fact_type_id,f.fact_type_version,f.fact_subtype_id,s_2.provider_id
Sort Method: quicksort Memory: 787kB
-> Hash Join (cost=33.59..697698.55 rows=35657 width=96) (actual time=9910.733..9923.439 rows=4228 loops=1)
Hash Cond: ((f.fact_type_id = ftvs.t) AND (f.fact_type_version = ftvs.v) AND (f.fact_subtype_id = ftvs.s))
-> Hash Join (cost=0.58..694942.27 rows=285254 width=96) (actual time=9910.557..9922.318 rows=4228 loops=1)
Hash Cond: (f.source_id = s_2.source_id)
-> Seq Scan on facts f (cost=0.00..672953.32 rows=5102885 width=94) (actual time=0.160..9556.369 rows=4201870 loops=1)
Filter: ((split_part(split_part(region_id,'-'::text,2),'.'::text,1) ~~ 'MX%'::text) AND (((COALESCE(fact_date,reported_date) >= '2019-03-01 00:00:00'::timestamp without time zone) AND (COALESCE(fact_date,reported_date) <= '2020-12-31 23:59:59'::timestamp without time zone)) OR ((COALESCE(fact_date,reported_date) >= '2020-12-31 23:59:59'::timestamp without time zone) AND (COALESCE(fact_date,reported_date) <= '2019-03-01 00:00:00'::timestamp without time zone)) OR (((fact_date IS NULL) OR (fact_date_end IS NOT NULL)) AND (COALESCE(fact_date,reported_date_end) IS NOT NULL) AND (COALESCE(fact_date,reported_date) <= '2019-03-01 00:00:00'::timestamp without time zone) AND (COALESCE(fact_date_end,reported_date_end) >= '2019-03-01 00:00:00'::timestamp without time zone))))
Rows Removed by Filter: 7541863
-> Hash (cost=0.36..0.36 rows=18 width=64) (actual time=0.064..0.064 rows=19 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> CTE Scan on allowed_sources s_2 (cost=0.00..0.36 rows=18 width=64) (actual time=0.044..0.060 rows=19 loops=1)
-> Hash (cost=29.50..29.50 rows=200 width=12) (actual time=0.157..0.157 rows=25 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> HashAggregate (cost=27.50..29.50 rows=200 width=12) (actual time=0.146..0.151 rows=25 loops=1)
Group Key: ftvs.t,ftvs.v,ftvs.s
-> CTE Scan on ftvs (cost=0.00..20.00 rows=1000 width=12) (actual time=0.127..0.136 rows=25 loops=1)
CTE fact_counts
-> GroupAggregate (cost=1172.20..1177.10 rows=178 width=128) (actual time=9963.271..9963.843 rows=72 loops=1)
Group Key: r3.region_id,ff.provider_id,(concat(ff.fact_type_id,'-',ff.fact_type_version,ff.fact_subtype_id))
-> Sort (cost=1172.20..1172.65 rows=178 width=112) (actual time=9963.261..9963.368 rows=2421 loops=1)
Sort Key: r3.region_id,ff.provider_id,(concat(ff.fact_type_id,'-',ff.fact_type_version,ff.fact_subtype_id))
Sort Method: quicksort Memory: 286kB
-> Nested Loop Left Join (cost=0.00..1165.55 rows=178 width=112) (actual time=9942.930..9961.788 rows=2421 loops=1)
Join Filter: (ff.region_id_natural_id ~~ r3.region_id_pattern)
Rows Removed by Join Filter: 75051
-> CTE Scan on public_regions r3 (cost=0.00..6.25 rows=1 width=64) (actual time=0.157..0.206 rows=32 loops=1)
Filter: ("substring"(region_id,1,6) = 'TIMXST'::text)
-> CTE Scan on filtered_facts ff (cost=0.00..713.14 rows=35657 width=92) (actual time=310.711..311.022 rows=2421 loops=32)
CTE counts_by_provider
-> HashAggregate (cost=4.89..7.12 rows=178 width=96) (actual time=9963.974..9963.986 rows=33 loops=1)
Group Key: fc.region_id,fc.provider_id
-> CTE Scan on fact_counts fc (cost=0.00..3.56 rows=178 width=128) (actual time=9963.272..9963.871 rows=72 loops=1)
CTE counts_by_type
-> HashAggregate (cost=4.89..7.12 rows=178 width=96) (actual time=0.046..0.065 rows=71 loops=1)
Group Key: fc_1.region_id,fc_1.fact_type
-> CTE Scan on fact_counts fc_1 (cost=0.00..3.56 rows=178 width=96) (actual time=0.000..0.006 rows=72 loops=1)
CTE aggregated_counts_by_provider
-> HashAggregate (cost=4.45..6.68 rows=178 width=64) (actual time=9964.070..9964.082 rows=32 loops=1)
Group Key: cbp.region_id
-> CTE Scan on counts_by_provider cbp (cost=0.00..3.56 rows=178 width=96) (actual time=9963.975..9963.998 rows=33 loops=1)
-> Hash Join (cost=10.68..16.35 rows=222 width=96) (actual time=0.212..0.222 rows=32 loops=1)
Hash Cond: (r.region_id = sub2.region_id)
-> CTE Scan on public_regions r (cost=0.00..5.00 rows=250 width=32) (actual time=0.001..0.003 rows=32 loops=1)
-> Hash (cost=8.46..8.46 rows=178 width=64) (actual time=0.203..0.203 rows=32 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Subquery Scan on sub2 (cost=4.45..8.46 rows=178 width=64) (actual time=0.177..0.192 rows=32 loops=1)
-> HashAggregate (cost=4.45..6.68 rows=178 width=64) (actual time=0.176..0.188 rows=32 loops=1)
Group Key: sub.region_id
-> CTE Scan on counts_by_type sub (cost=0.00..3.56 rows=178 width=96) (actual time=0.046..0.088 rows=71 loops=1)
-> Hash (cost=3.56..3.56 rows=178 width=64) (actual time=9964.110..9964.110 rows=32 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 13kB
-> CTE Scan on aggregated_counts_by_provider acbp (cost=0.00..3.56 rows=178 width=64) (actual time=9964.072..9964.097 rows=32 loops=1)
More edits: the fast plan above came from my local. Since prod is closer to dev (it is also Aurora), the production plan is more relevant. Here it is:
Hash Join (cost=713549.03..713557.81 rows=225 width=96) (actual time=56.873..56.889 rows=32 loops=1)
Hash Cond: (r.region_id = acbp.region_id)
CTE public_regions
-> Hash Join (cost=21.17..27.77 rows=249 width=61) (actual time=0.151..0.200 rows=32 loops=1)
Hash Cond: (r_1.source_id = s.source_id)
-> Index Only Scan using regions_region_id_index on regions r_1 (cost=0.56..4.58 rows=266 width=40) (actual time=0.013..0.034 rows=32 loops=1)
Index Cond: ((region_id ~>=~ 'TIMXST-MX'::text) AND (region_id ~<~ 'TIMXST-MY'::text))
Filter: (region_id ~~ 'TIMXST-MX%'::text)
Heap Fetches: 0
-> Hash (cost=16.45..16.45 rows=333 width=23) (actual time=0.132..0.132 rows=338 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 27kB
-> Seq Scan on sources s (cost=0.00..16.45 rows=333 width=23) (actual time=0.005..0.083 rows=338 loops=1)
Filter: (public OR (provider_id = ANY ('{SSSSSSSS,WWWWWWW}'::text[])))
Rows Removed by Filter: 3
CTE ftvs
-> Function Scan on _two_dimensional_array_to_one_dimensional t (cost=0.25..10.25 rows=1000 width=12) (actual time=0.070..0.074 rows=25 loops=1)
CTE allowed_sources
-> Seq Scan on sources s_1 (cost=0.00..17.34 rows=19 width=31) (actual time=0.004..0.081 rows=19 loops=1)
Filter: ((public OR (provider_id = ANY ('{SSSSSSSS,WWWWWWW}'::text[]))) AND (provider_id = ANY ('{SSSSSSSS,WWWWWWW}'::text[])))
Rows Removed by Filter: 322
CTE filtered_facts
-> GroupAggregate (cost=707831.57..712200.08 rows=37987 width=112) (actual time=35.313..39.184 rows=2421 loops=1)
Group Key: f.region_id,f.fact_type_id,f.fact_type_version,f.fact_subtype_id,s_2.provider_id
-> Sort (cost=707831.57..707926.54 rows=37987 width=96) (actual time=35.279..35.523 rows=4228 loops=1)
Sort Key: f.region_id,f.fact_type_id,f.fact_type_version,f.fact_subtype_id,s_2.provider_id
Sort Method: quicksort Memory: 787kB
-> Hash Join (cost=33.56..704942.05 rows=37987 width=96) (actual time=0.136..16.191 rows=4228 loops=1)
Hash Cond: ((f.fact_type_id = ftvs.t) AND (f.fact_type_version = ftvs.v) AND (f.fact_subtype_id = ftvs.s))
-> Nested Loop (cost=0.56..702007.78 rows=303897 width=96) (actual time=0.036..15.181 rows=4228 loops=1)
-> CTE Scan on allowed_sources s_2 (cost=0.00..0.38 rows=19 width=64) (actual time=0.005..0.093 rows=19 loops=1)
-> Index Scan using facts_source_id_idx on facts f (cost=0.56..36787.81 rows=15995 width=94) (actual time=0.507..0.768 rows=223 loops=19)
Index Cond: (source_id = s_2.source_id)
Filter: ((split_part(split_part(region_id,'-'::text,2),'.'::text,1) ~~ 'MX%'::text) AND (((COALESCE(fact_date,reported_date) >= '2019-03-01 00:00:00'::timestamp without time zone) AND (COALESCE(fact_date,reported_date) <= '2020-12-31 23:59:59'::timestamp without time zone)) OR ((COALESCE(fact_date,reported_date) >= '2020-12-31 23:59:59'::timestamp without time zone) AND (COALESCE(fact_date,reported_date) <= '2019-03-01 00:00:00'::timestamp without time zone)) OR (((fact_date IS NULL) OR (fact_date_end IS NOT NULL)) AND (COALESCE(fact_date,reported_date_end) IS NOT NULL) AND (COALESCE(fact_date,reported_date) <= '2019-03-01 00:00:00'::timestamp without time zone) AND (COALESCE(fact_date_end,reported_date_end) >= '2019-03-01 00:00:00'::timestamp without time zone))))
Rows Removed by Filter: 603
-> Hash (cost=29.50..29.50 rows=200 width=12) (actual time=0.095..0.095 rows=25 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> HashAggregate (cost=27.50..29.50 rows=200 width=12) (actual time=0.087..0.091 rows=25 loops=1)
Group Key: ftvs.t,ftvs.v,ftvs.s
-> CTE Scan on ftvs (cost=0.00..20.00 rows=1000 width=12) (actual time=0.071..0.080 rows=25 loops=1)
CTE fact_counts
-> GroupAggregate (cost=1248.47..1253.69 rows=190 width=128) (actual time=55.806..56.378 rows=72 loops=1)
Group Key: r3.region_id,ff.provider_id,(concat(ff.fact_type_id,'-',ff.fact_type_version,ff.fact_subtype_id))
-> Sort (cost=1248.47..1248.94 rows=190 width=112) (actual time=55.795..55.902 rows=2421 loops=1)
Sort Key: r3.region_id,ff.provider_id,(concat(ff.fact_type_id,'-',ff.fact_type_version,ff.fact_subtype_id))
Sort Method: quicksort Memory: 286kB
-> Nested Loop Left Join (cost=0.00..1241.28 rows=190 width=112) (actual time=35.477..54.300 rows=2421 loops=1)
Join Filter: (ff.region_id_natural_id ~~ r3.region_id_pattern)
Rows Removed by Join Filter: 75051
-> CTE Scan on public_regions r3 (cost=0.00..6.23 rows=1 width=64) (actual time=0.153..0.215 rows=32 loops=1)
Filter: ("substring"(region_id,1,6) = 'TIMXST'::text)
-> CTE Scan on filtered_facts ff (cost=0.00..759.74 rows=37987 width=92) (actual time=1.104..1.417 rows=2421 loops=32)
CTE counts_by_provider
-> HashAggregate (cost=5.23..7.60 rows=190 width=96) (actual time=56.520..56.532 rows=33 loops=1)
Group Key: fc.region_id,fc.provider_id
-> CTE Scan on fact_counts fc (cost=0.00..3.80 rows=190 width=128) (actual time=55.807..56.408 rows=72 loops=1)
CTE counts_by_type
-> HashAggregate (cost=5.23..7.60 rows=190 width=96) (actual time=0.044..0.064 rows=71 loops=1)
Group Key: fc_1.region_id,fc_1.fact_type
-> CTE Scan on fact_counts fc_1 (cost=0.00..3.80 rows=190 width=96) (actual time=0.000..0.005 rows=72 loops=1)
CTE aggregated_counts_by_provider
-> HashAggregate (cost=4.75..7.12 rows=190 width=64) (actual time=56.620..56.633 rows=32 loops=1)
Group Key: cbp.region_id
-> CTE Scan on counts_by_provider cbp (cost=0.00..3.80 rows=190 width=96) (actual time=56.521..56.545 rows=33 loops=1)
-> Hash Join (cost=11.40..17.04 rows=237 width=96) (actual time=0.206..0.215 rows=32 loops=1)
Hash Cond: (r.region_id = sub2.region_id)
-> CTE Scan on public_regions r (cost=0.00..4.98 rows=249 width=32) (actual time=0.001..0.003 rows=32 loops=1)
-> Hash (cost=9.03..9.03 rows=190 width=64) (actual time=0.197..0.197 rows=32 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Subquery Scan on sub2 (cost=4.75..9.03 rows=190 width=64) (actual time=0.175..0.190 rows=32 loops=1)
-> HashAggregate (cost=4.75..7.12 rows=190 width=64) (actual time=0.174..0.186 rows=32 loops=1)
Group Key: sub.region_id
-> CTE Scan on counts_by_type sub (cost=0.00..3.80 rows=190 width=96) (actual time=0.045..0.085 rows=71 loops=1)
-> Hash (cost=3.80..3.80 rows=190 width=64) (actual time=56.662..56.662 rows=32 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 13kB
-> CTE Scan on aggregated_counts_by_provider acbp (cost=0.00..3.80 rows=190 width=64) (actual time=56.622..56.644 rows=32 loops=1)
As you can see, the read of filtered_facts is even better than on local, using an index scan instead of a bitmap heap scan.
Solution
On the fast system, the index scan reads 19 * (603 + 223) = 15694 rows from facts and finds 19 * 223 = 4237. On the slow system, all 7541863 + 4201870 = 11743733 rows of facts are scanned, and 4201870 are found.
If the percentage of result rows returned is high enough, a sequential scan is the most efficient access strategy.
So the difference comes from different data in the two databases. I see no reason to suspect that PostgreSQL is doing anything wrong here.
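That percentage can be checked directly; a sketch that counts only the LIKE part of the plan's filter, for illustration:

```sql
-- Rough selectivity check: if matching/total is large, a sequential scan
-- beats repeated index probes.
SELECT count(*) AS total,
       count(*) FILTER (WHERE split_part(split_part(region_id, '-', 2), '.', 1) LIKE 'MX%') AS matching
FROM facts;
```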
Hey... I finally solved this by running ANALYZE. I never run ANALYZE on its own; I always run VACUUM ANALYZE, which, AFAIK, should both clean up dead tuples and update the statistics. I read somewhere that Aurora does not update statistics when running vacuum, so I gave plain ANALYZE a try. Maybe something got stuck under the hood and the statistics were not updated properly :shrug:
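For anyone hitting the same issue, the fix plus a way to confirm the statistics actually refreshed (pg_stat_user_tables is standard PostgreSQL):

```sql
-- Refresh planner statistics without a vacuum pass:
ANALYZE facts;

-- Confirm when statistics were last gathered:
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'facts';
```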