Problem description
My database development work currently spans three environments: a local PostgreSQL on Docker (the kartoza/postgis:11.5-2.5 image), and dev and production environments, which are Aurora PostgreSQL on AWS (db.r5.xlarge instances, version 3.1, compatible with PostgreSQL 11.6). Dev and prod are essentially identical: the schema is managed through migrations and data loading is automated, so bugs aside, they hold essentially the same data. Local is kept as close to dev and prod as possible.
I have a query that joins table F to table S via an FK (F.s_id -> S.s_id). On local and in production the query performs well enough, but in dev it is about 3x slower. Looking at the execution plans, the problem is the join. On local and in prod, the index is used for a bitmap heap scan:
-> Bitmap Heap Scan on f f (cost=769.47..28650.46 rows=15911 width=94) (actual time=93.371..93.889 rows=223 loops=19)
Recheck Cond: (s_id = s_2.s_id)
Filter: ((split_part(split_part(region_id,'-'::text,2),'.'::text,1) ~~ 'MX%'::text) AND (((COALESCE(f_date,r_date) >= '2019-03-01 00:00:00'::timestamp without time zone) AND (COALESCE(f_date,r_date) <= '2020-12-31 23:59:59'::timestamp without time zone)) OR ((COALESCE(f_date,r_date) >= '2020-12-31 23:59:59'::timestamp without time zone) AND (COALESCE(f_date,r_date) <= '2019-03-01 00:00:00'::timestamp without time zone)) OR (((f_date IS NULL) OR (f_date_end IS NOT NULL)) AND (COALESCE(f_date,r_date_end) IS NOT NULL) AND (COALESCE(f_date,r_date) <= '2019-03-01 00:00:00'::timestamp without time zone) AND (COALESCE(f_date_end,r_date_end) >= '2019-03-01 00:00:00'::timestamp without time zone))))
Rows Removed by Filter: 603
Heap Blocks: exact=10146
-> Bitmap Index Scan on f_s_id_idx (cost=0.00..765.49 rows=36645 width=0) (actual time=0.931..0.931 rows=1587 loops=19)
Index Cond: (s_id = s_2.s_id)
In dev, however, a sequential scan is performed:
-> Hash Join (cost=0.58..694942.27 rows=285254 width=96) (actual time=9910.557..9922.318 rows=4228 loops=1)
Hash Cond: (f.s_id = s_2.s_id)
-> Seq Scan on f f (cost=0.00..672953.32 rows=5102885 width=94) (actual time=0.160..9556.369 rows=4201870 loops=1)
Filter: ((split_part(split_part(region_id,'-'::text,2),'.'::text,1) ~~ 'MX%'::text) AND (((COALESCE(f_date,r_date) >= '2019-03-01 00:00:00'::timestamp without time zone) AND (COALESCE(f_date,r_date) <= '2020-12-31 23:59:59'::timestamp without time zone)) OR ((COALESCE(f_date,r_date) >= '2020-12-31 23:59:59'::timestamp without time zone) AND (COALESCE(f_date,r_date) <= '2019-03-01 00:00:00'::timestamp without time zone)) OR (((f_date IS NULL) OR (f_date_end IS NOT NULL)) AND (COALESCE(f_date,r_date_end) IS NOT NULL) AND (COALESCE(f_date,r_date) <= '2019-03-01 00:00:00'::timestamp without time zone) AND (COALESCE(f_date_end,r_date_end) >= '2019-03-01 00:00:00'::timestamp without time zone))))
Rows Removed by Filter: 7541863
-> Hash (cost=0.36..0.36 rows=18 width=64) (actual time=0.064..0.064 rows=19 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> CTE Scan on allowed_ss s_2 (cost=0.00..0.36 rows=18 width=64) (actual time=0.044..0.060 rows=19 loops=1)
What I tried
No matter what I tried, that index was not used in dev, not even with sequential scans disabled. With seq scans disabled, an index covering the complex date check was used instead:
-> Hash Join (cost=303647.45..808516.29 rows=285254 width=96) (actual time=5208.611..5214.111 rows=4228 loops=1)
Hash Cond: (f.s_id = s_2.s_id)
-> Bitmap Heap Scan on f f (cost=303646.86..786527.35 rows=5102885 width=94) (actual time=1513.315..4836.337 rows=4201870 loops=1)
Recheck Cond: (((COALESCE(f_date,r_date) <= '2019-03-01 00:00:00'::timestamp without time zone)) OR ((COALESCE(f_date,r_date_end) >= '2019-03-01 00:00:00'::timestamp without time zone)))
Filter: ((split_part(split_part(region_id,'-'::text,2),'.'::text,1) ~~ 'MX%'::text) AND (((COALESCE(f_date,r_date) >= '2019-03-01 00:00:00'::timestamp without time zone) AND (COALESCE(f_date,r_date) <= '2020-12-31 23:59:59'::timestamp without time zone)) OR ((COALESCE(f_date,r_date) >= '2020-12-31 23:59:59'::timestamp without time zone) AND (COALESCE(f_date,r_date) <= '2019-03-01 00:00:00'::timestamp without time zone)) OR (((f_date IS NULL) OR (f_date_end IS NOT NULL)) AND (COALESCE(f_date,r_date_end) IS NOT NULL) AND (COALESCE(f_date,r_date) <= '2019-03-01 00:00:00'::timestamp without time zone) AND (COALESCE(f_date_end,r_date_end) >= '2019-03-01 00:00:00'::timestamp without time zone))))
Rows Removed by Filter: 4341
Heap Blocks: exact=117252
-> BitmapOr (cost=303646.86..303646.86 rows=5895338 width=0) (actual time=1489.570..1489.571 rows=0 loops=1)
-> Bitmap Index Scan on f_dates_idx (cost=0.00..103966.71 rows=4092615 width=0) (actual time=653.589..653.589 rows=4101547 loops=1)
Index Cond: ((COALESCE(f_date,r_date) <= '2020-12-31 23:59:59'::timestamp without time zone))
-> Bitmap Index Scan on f_dates_idx (cost=0.00..1495.75 rows=58719 width=0) (actual time=0.012..0.012 rows=0 loops=1)
Index Cond: ((COALESCE(f_date,r_date) <= '2019-03-01 00:00:00'::timestamp without time zone))
-> Bitmap Index Scan on f_dates_idx (cost=0.00..194357.24 rows=1744004 width=0) (actual time=835.968..835.968 rows=104702 loops=1)
Index Cond: ((COALESCE(f_date,r_date_end) >= '2019-03-01 00:00:00'::timestamp without time zone))
-> Hash (cost=0.36..0.36 rows=18 width=64) (actual time=0.064..0.064 rows=19 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> CTE Scan on allowed_ss s_2 (cost=0.00..0.36 rows=18 width=64) (actual time=0.051..0.060 rows=19 loops=1)
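For context, the plan above was obtained by disabling sequential scans for the session only; roughly like this (diagnostic only, the actual query text is omitted here):

```sql
-- Diagnostic only: make seq scans prohibitively expensive for this session,
-- re-run the query under EXPLAIN, then restore the default.
SET enable_seqscan = off;
-- EXPLAIN (ANALYZE, BUFFERS) <the query under investigation>;
RESET enable_seqscan;
```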
WORK_MEM is already quite high: 256MB. The STATISTICS target for the column is also quite high, at 1000, which is greater than the current number of distinct values.
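For reference, a raised per-column statistics target only takes effect after the next ANALYZE; a sketch using the anonymized names from the plans above:

```sql
-- Raise the sampling target for the join column, then refresh statistics.
-- f and s_id are the anonymized names used in the plans above.
ALTER TABLE f ALTER COLUMN s_id SET STATISTICS 1000;
ANALYZE f;
```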
By the way, this is how the index was created:
create index f_s_id_idx on f(s_id)
The column also has another index with text_pattern_ops for LIKE-pattern searches, but this one was added for the equality comparisons used in the join.
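The LIKE-pattern index presumably looks something like this (the index name is hypothetical; only its existence and operator class are stated above):

```sql
-- text_pattern_ops makes a btree index usable for left-anchored LIKE
-- patterns regardless of collation.
create index f_s_id_pattern_idx on f (s_id text_pattern_ops);
```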
Edit: following @Laurenz's request, here is the full plan of the fast query (with the actual names; in the previous ones I had simplified/anonymized them):
Hash Join (cost=617711.37..617726.42 rows=250 width=96) (actual time=1828.947..1828.959 rows=32 loops=1)
Hash Cond: (r.region_id = sub2.region_id)
CTE public_regions
-> Hash Join (cost=22.47..29.07 rows=250 width=61) (actual time=1.034..1.136 rows=32 loops=1)
Hash Cond: (r_1.source_id = s.source_id)
-> Index Only Scan using regions_region_id_index on regions r_1 (cost=0.56..4.58 rows=266 width=40) (actual time=0.013..0.056 rows=32 loops=1)
Index Cond: ((region_id ~>=~ 'TIMXST-MX'::text) AND (region_id ~<~ 'TIMXST-MY'::text))
Filter: (region_id ~~ 'TIMXST-MX%'::text)
Heap Fetches: 32
-> Hash (cost=17.11..17.11 rows=384 width=23) (actual time=1.009..1.009 rows=338 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 27kB
-> Seq Scan on sources s (cost=0.00..17.11 rows=384 width=23) (actual time=0.473..0.968 rows=338 loops=1)
Filter: (public OR (provider_id = ANY ('{SSSSSSSS,WWWWWWW}'::text[])))
Rows Removed by Filter: 3
CTE ftvs
-> Function Scan on _two_dimensional_array_to_one_dimensional t (cost=0.25..10.25 rows=1000 width=12) (actual time=0.107..0.110 rows=25 loops=1)
CTE allowed_sources
-> Seq Scan on sources s_1 (cost=0.00..18.13 rows=21 width=31) (actual time=0.029..0.111 rows=19 loops=1)
Filter: ((public OR (provider_id = ANY ('{SSSSSSSS,WWWWWWW}'::text[]))) AND (provider_id = ANY ('{SSSSSSSS,WWWWWWW}'::text[])))
Rows Removed by Filter: 322
CTE filtered_facts
-> GroupAggregate (cost=611430.07..616233.39 rows=41768 width=112) (actual time=1807.723..1812.385 rows=2421 loops=1)
Group Key: f.region_id,f.fact_type_id,f.fact_type_version,f.fact_subtype_id,s_2.provider_id
-> Sort (cost=611430.07..611534.49 rows=41768 width=96) (actual time=1807.690..1808.326 rows=4228 loops=1)
Sort Key: f.region_id,f.fact_type_id,f.fact_type_version,f.fact_subtype_id,s_2.provider_id
Sort Method: quicksort Memory: 787kB
-> Hash Join (cost=802.47..608224.35 rows=41768 width=96) (actual time=465.740..1787.072 rows=4228 loops=1)
Hash Cond: ((f.fact_type_id = ftvs.t) AND (f.fact_type_version = ftvs.v) AND (f.fact_subtype_id = ftvs.s))
-> Nested Loop (cost=769.47..605001.34 rows=334141 width=96) (actual time=465.603..1785.045 rows=4228 loops=1)
-> CTE Scan on allowed_sources s_2 (cost=0.00..0.42 rows=21 width=64) (actual time=0.030..0.174 rows=19 loops=1)
-> Bitmap Heap Scan on facts f (cost=769.47..28650.46 rows=15911 width=94) (actual time=93.371..93.889 rows=223 loops=19)
Recheck Cond: (source_id = s_2.source_id)
Filter: ((split_part(split_part(region_id,'-'::text,2),'.'::text,1) ~~ 'MX%'::text) AND (((COALESCE(fact_date,reported_date) >= '2019-03-01 00:00:00'::timestamp without time zone) AND (COALESCE(fact_date,reported_date) <= '2020-12-31 23:59:59'::timestamp without time zone)) OR ((COALESCE(fact_date,reported_date) >= '2020-12-31 23:59:59'::timestamp without time zone) AND (COALESCE(fact_date,reported_date) <= '2019-03-01 00:00:00'::timestamp without time zone)) OR (((fact_date IS NULL) OR (fact_date_end IS NOT NULL)) AND (COALESCE(fact_date,reported_date_end) IS NOT NULL) AND (COALESCE(fact_date,reported_date) <= '2019-03-01 00:00:00'::timestamp without time zone) AND (COALESCE(fact_date_end,reported_date_end) >= '2019-03-01 00:00:00'::timestamp without time zone))))
Rows Removed by Filter: 603
Heap Blocks: exact=10146
-> Bitmap Index Scan on facts_source_id_idx (cost=0.00..765.49 rows=36645 width=0) (actual time=0.931..0.931 rows=1587 loops=19)
Index Cond: (source_id = s_2.source_id)
-> Hash (cost=29.50..29.50 rows=200 width=12) (actual time=0.129..0.129 rows=25 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> HashAggregate (cost=27.50..29.50 rows=200 width=12) (actual time=0.122..0.125 rows=25 loops=1)
Group Key: ftvs.t,ftvs.v,ftvs.s
-> CTE Scan on ftvs (cost=0.00..20.00 rows=1000 width=12) (actual time=0.108..0.116 rows=25 loops=1)
By the way, as you can see in the fast plan, the row counts are overestimated. That could be one reason the seq scan is chosen when things differ subtly elsewhere, but I don't know what else I could do to improve the statistics on that column.
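One way to see what the planner believes about the column is the pg_stats view (a generic sketch with the anonymized names):

```sql
-- Inspect the gathered statistics for the join column.
SELECT n_distinct, null_frac, correlation
FROM pg_stats
WHERE tablename = 'f' AND attname = 's_id';
```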
Full slow plan:
Hash Join (cost=705764.38..705772.86 rows=198 width=96) (actual time=9964.327..9964.344 rows=32 loops=1)
Hash Cond: (r.region_id = acbp.region_id)
CTE public_regions
-> Hash Join (cost=20.82..27.42 rows=250 width=61) (actual time=0.155..0.191 rows=32 loops=1)
Hash Cond: (r_1.source_id = s.source_id)
-> Index Only Scan using regions_region_id_index on regions r_1 (cost=0.56..4.58 rows=266 width=40) (actual time=0.017..0.027 rows=32 loops=1)
Index Cond: ((region_id ~>=~ 'TIMXST-MX'::text) AND (region_id ~<~ 'TIMXST-MY'::text))
Filter: (region_id ~~ 'TIMXST-MX%'::text)
Heap Fetches: 0
-> Hash (cost=16.26..16.26 rows=320 width=23) (actual time=0.132..0.132 rows=338 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 27kB
-> Seq Scan on sources s (cost=0.00..16.26 rows=320 width=23) (actual time=0.006..0.083 rows=338 loops=1)
Filter: (public OR (provider_id = ANY ('{SSSSSSSS,WWWWWWW}'::text[])))
Rows Removed by Filter: 3
CTE ftvs
-> Function Scan on _two_dimensional_array_to_one_dimensional t (cost=0.25..10.25 rows=1000 width=12) (actual time=0.124..0.128 rows=25 loops=1)
CTE allowed_sources
-> Seq Scan on sources s_1 (cost=0.00..17.12 rows=18 width=31) (actual time=0.043..0.053 rows=19 loops=1)
Filter: ((public OR (provider_id = ANY ('{SSSSSSSS,WWWWWWW}'::text[]))) AND (provider_id = ANY ('{SSSSSSSS,WWWWWWW}'::text[])))
Rows Removed by Filter: 322
CTE filtered_facts
-> GroupAggregate (cost=700394.56..704495.12 rows=35657 width=112) (actual time=9942.762..9946.578 rows=2421 loops=1)
Group Key: f.region_id,f.fact_type_id,f.fact_type_version,f.fact_subtype_id,s_2.provider_id
-> Sort (cost=700394.56..700483.70 rows=35657 width=96) (actual time=9942.731..9942.932 rows=4228 loops=1)
Sort Key: f.region_id,f.fact_type_id,f.fact_type_version,f.fact_subtype_id,s_2.provider_id
Sort Method: quicksort Memory: 787kB
-> Hash Join (cost=33.59..697698.55 rows=35657 width=96) (actual time=9910.733..9923.439 rows=4228 loops=1)
Hash Cond: ((f.fact_type_id = ftvs.t) AND (f.fact_type_version = ftvs.v) AND (f.fact_subtype_id = ftvs.s))
-> Hash Join (cost=0.58..694942.27 rows=285254 width=96) (actual time=9910.557..9922.318 rows=4228 loops=1)
Hash Cond: (f.source_id = s_2.source_id)
-> Seq Scan on facts f (cost=0.00..672953.32 rows=5102885 width=94) (actual time=0.160..9556.369 rows=4201870 loops=1)
Filter: ((split_part(split_part(region_id,'-'::text,2),'.'::text,1) ~~ 'MX%'::text) AND (((COALESCE(fact_date,reported_date) >= '2019-03-01 00:00:00'::timestamp without time zone) AND (COALESCE(fact_date,reported_date) <= '2020-12-31 23:59:59'::timestamp without time zone)) OR ((COALESCE(fact_date,reported_date) >= '2020-12-31 23:59:59'::timestamp without time zone) AND (COALESCE(fact_date,reported_date) <= '2019-03-01 00:00:00'::timestamp without time zone)) OR (((fact_date IS NULL) OR (fact_date_end IS NOT NULL)) AND (COALESCE(fact_date,reported_date_end) IS NOT NULL) AND (COALESCE(fact_date,reported_date) <= '2019-03-01 00:00:00'::timestamp without time zone) AND (COALESCE(fact_date_end,reported_date_end) >= '2019-03-01 00:00:00'::timestamp without time zone))))
Rows Removed by Filter: 7541863
-> Hash (cost=0.36..0.36 rows=18 width=64) (actual time=0.064..0.064 rows=19 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> CTE Scan on allowed_sources s_2 (cost=0.00..0.36 rows=18 width=64) (actual time=0.044..0.060 rows=19 loops=1)
-> Hash (cost=29.50..29.50 rows=200 width=12) (actual time=0.157..0.157 rows=25 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> HashAggregate (cost=27.50..29.50 rows=200 width=12) (actual time=0.146..0.151 rows=25 loops=1)
Group Key: ftvs.t,ftvs.v,ftvs.s
-> CTE Scan on ftvs (cost=0.00..20.00 rows=1000 width=12) (actual time=0.127..0.136 rows=25 loops=1)
CTE fact_counts
-> GroupAggregate (cost=1172.20..1177.10 rows=178 width=128) (actual time=9963.271..9963.843 rows=72 loops=1)
Group Key: r3.region_id,ff.provider_id,(concat(ff.fact_type_id,'-',ff.fact_type_version,ff.fact_subtype_id))
-> Sort (cost=1172.20..1172.65 rows=178 width=112) (actual time=9963.261..9963.368 rows=2421 loops=1)
Sort Key: r3.region_id,ff.provider_id,(concat(ff.fact_type_id,'-',ff.fact_type_version,ff.fact_subtype_id))
Sort Method: quicksort Memory: 286kB
-> Nested Loop Left Join (cost=0.00..1165.55 rows=178 width=112) (actual time=9942.930..9961.788 rows=2421 loops=1)
Join Filter: (ff.region_id_natural_id ~~ r3.region_id_pattern)
Rows Removed by Join Filter: 75051
-> CTE Scan on public_regions r3 (cost=0.00..6.25 rows=1 width=64) (actual time=0.157..0.206 rows=32 loops=1)
Filter: ("substring"(region_id,1,6) = 'TIMXST'::text)
-> CTE Scan on filtered_facts ff (cost=0.00..713.14 rows=35657 width=92) (actual time=310.711..311.022 rows=2421 loops=32)
CTE counts_by_provider
-> HashAggregate (cost=4.89..7.12 rows=178 width=96) (actual time=9963.974..9963.986 rows=33 loops=1)
Group Key: fc.region_id,fc.provider_id
-> CTE Scan on fact_counts fc (cost=0.00..3.56 rows=178 width=128) (actual time=9963.272..9963.871 rows=72 loops=1)
CTE counts_by_type
-> HashAggregate (cost=4.89..7.12 rows=178 width=96) (actual time=0.046..0.065 rows=71 loops=1)
Group Key: fc_1.region_id,fc_1.fact_type
-> CTE Scan on fact_counts fc_1 (cost=0.00..3.56 rows=178 width=96) (actual time=0.000..0.006 rows=72 loops=1)
CTE aggregated_counts_by_provider
-> HashAggregate (cost=4.45..6.68 rows=178 width=64) (actual time=9964.070..9964.082 rows=32 loops=1)
Group Key: cbp.region_id
-> CTE Scan on counts_by_provider cbp (cost=0.00..3.56 rows=178 width=96) (actual time=9963.975..9963.998 rows=33 loops=1)
-> Hash Join (cost=10.68..16.35 rows=222 width=96) (actual time=0.212..0.222 rows=32 loops=1)
Hash Cond: (r.region_id = sub2.region_id)
-> CTE Scan on public_regions r (cost=0.00..5.00 rows=250 width=32) (actual time=0.001..0.003 rows=32 loops=1)
-> Hash (cost=8.46..8.46 rows=178 width=64) (actual time=0.203..0.203 rows=32 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Subquery Scan on sub2 (cost=4.45..8.46 rows=178 width=64) (actual time=0.177..0.192 rows=32 loops=1)
-> HashAggregate (cost=4.45..6.68 rows=178 width=64) (actual time=0.176..0.188 rows=32 loops=1)
Group Key: sub.region_id
-> CTE Scan on counts_by_type sub (cost=0.00..3.56 rows=178 width=96) (actual time=0.046..0.088 rows=71 loops=1)
-> Hash (cost=3.56..3.56 rows=178 width=64) (actual time=9964.110..9964.110 rows=32 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 13kB
-> CTE Scan on aggregated_counts_by_provider acbp (cost=0.00..3.56 rows=178 width=64) (actual time=9964.072..9964.097 rows=32 loops=1)
More edits: the fast plan above came from my local. Since prod is closer to dev (it is also Aurora), the production plan is more relevant. Here it is:
Hash Join (cost=713549.03..713557.81 rows=225 width=96) (actual time=56.873..56.889 rows=32 loops=1)
Hash Cond: (r.region_id = acbp.region_id)
CTE public_regions
-> Hash Join (cost=21.17..27.77 rows=249 width=61) (actual time=0.151..0.200 rows=32 loops=1)
Hash Cond: (r_1.source_id = s.source_id)
-> Index Only Scan using regions_region_id_index on regions r_1 (cost=0.56..4.58 rows=266 width=40) (actual time=0.013..0.034 rows=32 loops=1)
Index Cond: ((region_id ~>=~ 'TIMXST-MX'::text) AND (region_id ~<~ 'TIMXST-MY'::text))
Filter: (region_id ~~ 'TIMXST-MX%'::text)
Heap Fetches: 0
-> Hash (cost=16.45..16.45 rows=333 width=23) (actual time=0.132..0.132 rows=338 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 27kB
-> Seq Scan on sources s (cost=0.00..16.45 rows=333 width=23) (actual time=0.005..0.083 rows=338 loops=1)
Filter: (public OR (provider_id = ANY ('{SSSSSSSS,WWWWWWW}'::text[])))
Rows Removed by Filter: 3
CTE ftvs
-> Function Scan on _two_dimensional_array_to_one_dimensional t (cost=0.25..10.25 rows=1000 width=12) (actual time=0.070..0.074 rows=25 loops=1)
CTE allowed_sources
-> Seq Scan on sources s_1 (cost=0.00..17.34 rows=19 width=31) (actual time=0.004..0.081 rows=19 loops=1)
Filter: ((public OR (provider_id = ANY ('{SSSSSSSS,WWWWWWW}'::text[]))) AND (provider_id = ANY ('{SSSSSSSS,WWWWWWW}'::text[])))
Rows Removed by Filter: 322
CTE filtered_facts
-> GroupAggregate (cost=707831.57..712200.08 rows=37987 width=112) (actual time=35.313..39.184 rows=2421 loops=1)
Group Key: f.region_id,f.fact_type_id,f.fact_type_version,f.fact_subtype_id,s_2.provider_id
-> Sort (cost=707831.57..707926.54 rows=37987 width=96) (actual time=35.279..35.523 rows=4228 loops=1)
Sort Key: f.region_id,f.fact_type_id,f.fact_type_version,f.fact_subtype_id,s_2.provider_id
Sort Method: quicksort Memory: 787kB
-> Hash Join (cost=33.56..704942.05 rows=37987 width=96) (actual time=0.136..16.191 rows=4228 loops=1)
Hash Cond: ((f.fact_type_id = ftvs.t) AND (f.fact_type_version = ftvs.v) AND (f.fact_subtype_id = ftvs.s))
-> Nested Loop (cost=0.56..702007.78 rows=303897 width=96) (actual time=0.036..15.181 rows=4228 loops=1)
-> CTE Scan on allowed_sources s_2 (cost=0.00..0.38 rows=19 width=64) (actual time=0.005..0.093 rows=19 loops=1)
-> Index Scan using facts_source_id_idx on facts f (cost=0.56..36787.81 rows=15995 width=94) (actual time=0.507..0.768 rows=223 loops=19)
Index Cond: (source_id = s_2.source_id)
Filter: ((split_part(split_part(region_id,'-'::text,2),'.'::text,1) ~~ 'MX%'::text) AND (((COALESCE(fact_date,reported_date) >= '2019-03-01 00:00:00'::timestamp without time zone) AND (COALESCE(fact_date,reported_date) <= '2020-12-31 23:59:59'::timestamp without time zone)) OR ((COALESCE(fact_date,reported_date) >= '2020-12-31 23:59:59'::timestamp without time zone) AND (COALESCE(fact_date,reported_date) <= '2019-03-01 00:00:00'::timestamp without time zone)) OR (((fact_date IS NULL) OR (fact_date_end IS NOT NULL)) AND (COALESCE(fact_date,reported_date_end) IS NOT NULL) AND (COALESCE(fact_date,reported_date) <= '2019-03-01 00:00:00'::timestamp without time zone) AND (COALESCE(fact_date_end,reported_date_end) >= '2019-03-01 00:00:00'::timestamp without time zone))))
Rows Removed by Filter: 603
-> Hash (cost=29.50..29.50 rows=200 width=12) (actual time=0.095..0.095 rows=25 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> HashAggregate (cost=27.50..29.50 rows=200 width=12) (actual time=0.087..0.091 rows=25 loops=1)
Group Key: ftvs.t,ftvs.v,ftvs.s
-> CTE Scan on ftvs (cost=0.00..20.00 rows=1000 width=12) (actual time=0.071..0.080 rows=25 loops=1)
CTE fact_counts
-> GroupAggregate (cost=1248.47..1253.69 rows=190 width=128) (actual time=55.806..56.378 rows=72 loops=1)
Group Key: r3.region_id,ff.provider_id,(concat(ff.fact_type_id,'-',ff.fact_type_version,ff.fact_subtype_id))
-> Sort (cost=1248.47..1248.94 rows=190 width=112) (actual time=55.795..55.902 rows=2421 loops=1)
Sort Key: r3.region_id,ff.provider_id,(concat(ff.fact_type_id,'-',ff.fact_type_version,ff.fact_subtype_id))
Sort Method: quicksort Memory: 286kB
-> Nested Loop Left Join (cost=0.00..1241.28 rows=190 width=112) (actual time=35.477..54.300 rows=2421 loops=1)
Join Filter: (ff.region_id_natural_id ~~ r3.region_id_pattern)
Rows Removed by Join Filter: 75051
-> CTE Scan on public_regions r3 (cost=0.00..6.23 rows=1 width=64) (actual time=0.153..0.215 rows=32 loops=1)
Filter: ("substring"(region_id,1,6) = 'TIMXST'::text)
-> CTE Scan on filtered_facts ff (cost=0.00..759.74 rows=37987 width=92) (actual time=1.104..1.417 rows=2421 loops=32)
CTE counts_by_provider
-> HashAggregate (cost=5.23..7.60 rows=190 width=96) (actual time=56.520..56.532 rows=33 loops=1)
Group Key: fc.region_id,fc.provider_id
-> CTE Scan on fact_counts fc (cost=0.00..3.80 rows=190 width=128) (actual time=55.807..56.408 rows=72 loops=1)
CTE counts_by_type
-> HashAggregate (cost=5.23..7.60 rows=190 width=96) (actual time=0.044..0.064 rows=71 loops=1)
Group Key: fc_1.region_id,fc_1.fact_type
-> CTE Scan on fact_counts fc_1 (cost=0.00..3.80 rows=190 width=96) (actual time=0.000..0.005 rows=72 loops=1)
CTE aggregated_counts_by_provider
-> HashAggregate (cost=4.75..7.12 rows=190 width=64) (actual time=56.620..56.633 rows=32 loops=1)
Group Key: cbp.region_id
-> CTE Scan on counts_by_provider cbp (cost=0.00..3.80 rows=190 width=96) (actual time=56.521..56.545 rows=33 loops=1)
-> Hash Join (cost=11.40..17.04 rows=237 width=96) (actual time=0.206..0.215 rows=32 loops=1)
Hash Cond: (r.region_id = sub2.region_id)
-> CTE Scan on public_regions r (cost=0.00..4.98 rows=249 width=32) (actual time=0.001..0.003 rows=32 loops=1)
-> Hash (cost=9.03..9.03 rows=190 width=64) (actual time=0.197..0.197 rows=32 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Subquery Scan on sub2 (cost=4.75..9.03 rows=190 width=64) (actual time=0.175..0.190 rows=32 loops=1)
-> HashAggregate (cost=4.75..7.12 rows=190 width=64) (actual time=0.174..0.186 rows=32 loops=1)
Group Key: sub.region_id
-> CTE Scan on counts_by_type sub (cost=0.00..3.80 rows=190 width=96) (actual time=0.045..0.085 rows=71 loops=1)
-> Hash (cost=3.80..3.80 rows=190 width=64) (actual time=56.662..56.662 rows=32 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 13kB
-> CTE Scan on aggregated_counts_by_provider acbp (cost=0.00..3.80 rows=190 width=64) (actual time=56.622..56.644 rows=32 loops=1)
As you can see, the read of filtered_facts is even better than on local, using an index scan instead of a bitmap heap scan.
Solution
On the fast system, the index scan reads 19 * (603 + 223) = 15694 rows from facts and finds 19 * 223 = 4237. On the slow system, all 7541863 + 4201870 = 11743733 rows of facts are scanned, and 4201870 are found.
If the percentage of result rows returned is high enough, a sequential scan is the most efficient access strategy.
So the difference comes from different data in the two databases. I see no reason to suspect that PostgreSQL is doing anything wrong here.
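That percentage can be checked directly; a sketch that counts only the LIKE part of the plan's filter, for illustration:

```sql
-- Rough selectivity check: if matching/total is large, a sequential scan
-- beats repeated index probes.
SELECT count(*) AS total,
       count(*) FILTER (WHERE split_part(split_part(region_id, '-', 2), '.', 1) LIKE 'MX%') AS matching
FROM facts;
```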
Hey... I finally solved this by running ANALYZE. I never run ANALYZE on its own; I always run VACUUM ANALYZE, which, AFAIK, should both clean up dead tuples and update the statistics. I read somewhere that Aurora does not update statistics when running vacuum, so I gave plain ANALYZE a try. Maybe something got stuck under the hood and the statistics were not updated properly :shrug:
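For anyone hitting the same issue, the fix plus a way to confirm the statistics actually refreshed (pg_stat_user_tables is standard PostgreSQL):

```sql
-- Refresh planner statistics without a vacuum pass:
ANALYZE facts;

-- Confirm when statistics were last gathered:
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'facts';
```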