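Problem description
We have a transactions table of approx. 10m rows, and it keeps growing. Each of our customers specifies many rules that group certain transactions together based on location, associated products, sales customer, and so on. From these rules we generate nightly reports that let them compare the price a customer is paying for products with their purchase price from different price lists. These price lists change daily, and for each transaction date we have to find either the yearly price they set or the price that was effective on the transaction date.
These price lists can change historically and keep changing, as do newly added historical transactions, so we have to keep regenerating these reports within each financial year.
We are having a problem with the two types of price list/price joins we have to perform. The first is on the set yearly price list.
I have left out the query that brings the transactions in and puts them into a table called transaction_data_6787.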
EXPLAIN ANALYZE
SELECT *
FROM transaction_data_6787 t
INNER JOIN LATERAL
(
    SELECT p."Price"
    FROM "Prices" p
    INNER JOIN "PriceLists" pl ON p."PriceListId" = pl."Id"
    WHERE pl."CustomerId" = 20
      AND pl."Year" = 2020
      AND pl."PriceListTypeId" = 2
      AND p."ProductId" = t.product_id
    LIMIT 1
) AS prices ON true
Nested Loop (cost=0.70..133877.20 rows=5394 width=165) (actual time=0.521..193.638 rows=5394 loops=1)
  -> Seq Scan on transaction_data_6787 t (cost=0.00..159.94 rows=5394 width=145) (actual time=0.005..0.593 rows=5394 loops=1)
  -> Limit (cost=0.70..24.77 rows=1 width=20) (actual time=0.035..0.035 rows=1 loops=5394)
        -> Nested Loop (cost=0.70..24.77 rows=1 width=20) (actual time=0.035..0.035 rows=1 loops=5394)
              -> Index Scan using ix_prices_covering on "Prices" p (cost=0.42..8.44 rows=1 width=16) (actual time=0.006..0.015 rows=23 loops=5394)
                    Index Cond: (("ProductId" = t.product_id))
              -> Index Scan using ix_pricelists_covering on "PriceLists" pl (cost=0.28..8.30 rows=1 width=12) (actual time=0.001..0.001 rows=0 loops=122443)
                    Index Cond: (("Id" = p."PriceListId") AND ("CustomerId" = 20) AND ("PriceListTypeId" = 2))
                    Filter: ("Year" = 2020)
                    Rows Removed by Filter: 0
Planning Time: 0.307 ms
Execution Time: 193.982 ms
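If I remove the LIMIT 1, the execution time drops to 3 ms and the 122443 loops on ix_pricelists_covering no longer happen. The reason we use a lateral join is that the price query is built dynamically: sometimes, when we are not joining on the yearly price list, we join on the effective price lists instead, as shown here: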
EXPLAIN ANALYZE
SELECT *
FROM transaction_data_6787 t
INNER JOIN LATERAL
(
    SELECT p."Price"
    FROM "Prices" p
    INNER JOIN "PriceLists" pl ON p."PriceListId" = pl."Id"
    WHERE pl."CustomerId" = 20
      AND pl."PriceListTypeId" = 1
      AND p."ProductId" = t.product_id
      AND pl."ValidFromDate" <= t.transaction_date
    ORDER BY pl."ValidFromDate" DESC
    LIMIT 1
) AS prices ON true
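This is killing our performance. Some queries take 20 seconds, whereas without the ORDER BY date DESC / LIMIT 1 they complete in milliseconds, but then we could get duplicate prices.
We are happy to rewrite the query if there is a better way to join to the most recent record. We have thousands of price lists and roughly 100,000 prices, each transaction can have hundreds or even thousands of effective prices, and we need to be sure we get the one that was most recently effective for the product on the transaction date.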
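For illustration, one common PostgreSQL way to express "the most recent row per group" without a lateral subquery is DISTINCT ON. The following is only a sketch of that technique applied to the effective-price query above, not a verified fix: it assumes that t.transaction_id uniquely identifies a transaction, and whether it is actually faster on this data would need to be checked with EXPLAIN ANALYZE.

```sql
-- Sketch only: "latest effective price per transaction" via DISTINCT ON.
-- Assumption: t.transaction_id is unique per row of transaction_data_6787.
SELECT DISTINCT ON (t.transaction_id)
       t.*,
       p."Price"
FROM transaction_data_6787 t
INNER JOIN "Prices" p
        ON p."ProductId" = t.product_id
INNER JOIN "PriceLists" pl
        ON pl."Id" = p."PriceListId"
WHERE pl."CustomerId" = 20
  AND pl."PriceListTypeId" = 1
  AND pl."ValidFromDate" <= t.transaction_date
-- DISTINCT ON keeps the first row of each transaction_id group,
-- so sorting by ValidFromDate DESC keeps the most recent price list.
ORDER BY t.transaction_id, pl."ValidFromDate" DESC;
```

As with the INNER JOIN LATERAL version, transactions with no matching price are dropped, and if two price lists share the same ValidFromDate the row that is kept is arbitrary unless a tie-breaker is added to the ORDER BY.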
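I have found that if I denormalize the price lists/prices into a single table and add an index with ValidFromDate DESC, it seems to eliminate the loops. However, I am hesitant to denormalize and then have to maintain that data: these reports can be run ad hoc as well as in batch jobs, so we would have to keep that data up to date in real time.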
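To make the denormalization idea concrete, here is a rough sketch of what such a table and index could look like. The names price_lookup and ix_price_lookup_latest are hypothetical and the column choice is an assumption based on the queries above; it does not address the maintenance concern, it only shows the shape of the approach.

```sql
-- Hypothetical denormalized table combining "Prices" and "PriceLists".
CREATE TABLE price_lookup AS
SELECT pl."CustomerId",
       pl."PriceListTypeId",
       pl."ValidFromDate",
       p."ProductId",
       p."Price"
FROM "Prices" p
INNER JOIN "PriceLists" pl ON pl."Id" = p."PriceListId";

-- Equality columns first, then ValidFromDate DESC, so the newest
-- qualifying row for a product can be read from the index in order.
CREATE INDEX ix_price_lookup_latest
    ON price_lookup ("CustomerId", "PriceListTypeId", "ProductId", "ValidFromDate" DESC);

-- The lateral lookup then needs no join inside the subquery.
SELECT t.*, prices."Price"
FROM transaction_data_6787 t
INNER JOIN LATERAL
(
    SELECT pk."Price"
    FROM price_lookup pk
    WHERE pk."CustomerId" = 20
      AND pk."PriceListTypeId" = 1
      AND pk."ProductId" = t.product_id
      AND pk."ValidFromDate" <= t.transaction_date
    ORDER BY pk."ValidFromDate" DESC
    LIMIT 1
) AS prices ON true;
```

A table created this way is a snapshot, so it would have to be refreshed or kept in sync whenever "Prices" or "PriceLists" change, which is exactly the cost being weighed here.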
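Updated EXPLAIN/ANALYZE:
Below I have added the query that needs to get the price most recently in effect on the transaction date. I now see that when …
I still see the slower query performing a large number of loops, 200k+ (when the LIMIT 1/… is included).
Perhaps the better question is what we can do instead of a lateral join that would let us join the effective prices to the transactions in the most efficient/performant way. I am hoping to avoid denormalizing and maintaining the data, but if that is the only way, we will do it. If there is a way to rewrite this without denormalizing, I would really appreciate any insight.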
Nested Loop (cost=14.21..76965.60 rows=5394 width=10) (actual time=408.948..408.950 rows=0 loops=1)
  Output: t.transaction_id, pr."Price"
  Buffers: shared hit=688022
  -> Seq Scan on public.transaction_data_6787 t (cost=0.00..159.94 rows=5394 width=29) (actual time=0.018..0.682 rows=5394 loops=1)
        Output: t.transaction_id
        Buffers: shared hit=106
  -> Limit (cost=14.21..14.22 rows=1 width=10) (actual time=0.075..0.075 rows=0 loops=5394)
        Output: pr."Price", pl."ValidFromDate"
        Buffers: shared hit=687916
        -> Sort (cost=14.21..14.22 rows=1 width=10) (actual time=0.075..0.075 rows=0 loops=5394)
              Output: pr."Price", pl."ValidFromDate"
              Sort Key: pl."ValidFromDate" DESC
              Sort Method: quicksort Memory: 25kB
              Buffers: shared hit=687916
              -> Nested Loop (cost=0.70..14.20 rows=1 width=10) (actual time=0.074..0.074 rows=0 loops=5394)
                    Output: pr."Price", pl."ValidFromDate"
                    Inner Unique: true
                    Buffers: shared hit=687916
                    -> Index Only Scan using ix_prices_covering on public."Prices" pr (cost=0.42..4.44 rows=1 width=10) (actual time=0.007..0.019 rows=51 loops=5394)
                          Output: pr."ProductId", pr."ValidFromDate", pr."Id", pr."Price", pr."PriceListId"
                          Index Cond: (pr."ProductId" = t.product_id)
                          Heap Fetches: 0
                          Buffers: shared hit=17291
                    -> Index Scan using ix_pricelists_covering on public."PriceLists" pl (cost=0.28..8.30 rows=1 width=8) (actual time=0.001..0.001 rows=0 loops=273678)
                          Output: pl."Id", pl."Name", pl."CustomerId", pl."ValidFromDate", pl."PriceListTypeId"
                          Index Cond: ((pl."Id" = pr."PriceListId") AND (pl."CustomerId" = 20) AND (pl."PriceListTypeId" = 1))
                          Filter: (pl."ValidFromDate" <= t.transaction_date)
                          Rows Removed by Filter: 0
                          Buffers: shared hit=670625
Planning Time: 1.254 ms
Execution Time: 409.088 ms
Gather (cost=6395.67..7011.99 rows=68 width=10) (actual time=92.481..92.554 rows=0 loops=1)
  Output: t.transaction_id, pr."Price"
  Workers Planned: 2
  Workers Launched: 2
  Buffers: shared hit=1466 read=2
  -> Hash Join (cost=5395.67..6005.19 rows=28 width=10) (actual time=75.126..75.129 rows=0 loops=3)
        Output: t.transaction_id, pr."Price"
        Inner Unique: true
        Hash Cond: (pr."PriceListId" = pl."Id")
        Join Filter: (pl."ValidFromDate" <= t.transaction_date)
        Rows Removed by Join Filter: 41090
        Buffers: shared hit=1466 read=2
        Worker 0: actual time=64.707..64.709 rows=0 loops=1
          Buffers: shared hit=462
        Worker 1: actual time=72.545..72.547 rows=0 loops=1
          Buffers: shared hit=550 read=1
        -> Merge Join (cost=5374.09..5973.85 rows=3712 width=18) (actual time=26.804..61.492 rows=91226 loops=3)
              Output: t.transaction_id, t.transaction_date, pr."PriceListId"
              Merge Cond: (pr."ProductId" = t.product_id)
              Buffers: shared hit=1325 read=2
              Worker 0: actual time=17.677..51.590 rows=83365 loops=1
                Buffers: shared hit=400
              Worker 1: actual time=24.995..59.395 rows=103814 loops=1
                Buffers: shared hit=488 read=1
              -> Parallel Index Only Scan using ix_prices_covering on public."Prices" pr (cost=0.42..7678.38 rows=79544 width=29) (actual time=0.036..12.136 rows=42281 loops=3)
                    Output: pr."ProductId", pr."PriceListId"
                    Heap Fetches: 0
                    Buffers: shared hit=989 read=2
                    Worker 0: actual time=0.037..9.660 rows=36873 loops=1
                      Buffers: shared hit=285
                    Worker 1: actual time=0.058..13.459 rows=47708 loops=1
                      Buffers: shared hit=373 read=1
              -> Sort (cost=494.29..507.78 rows=5394 width=29) (actual time=9.037..14.700 rows=94555 loops=3)
                    Output: t.transaction_id, t.product_id, t.transaction_date
                    Sort Key: t.product_id
                    Sort Method: quicksort Memory: 614kB
                    Worker 0: Sort Method: quicksort Memory: 614kB
                    Worker 1: Sort Method: quicksort Memory: 614kB
                    Buffers: shared hit=336
                    Worker 0: actual time=6.608..12.034 rows=86577 loops=1
                      Buffers: shared hit=115
                    Worker 1: actual time=8.973..14.598 rows=107126 loops=1
                      Buffers: shared hit=115
                    -> Seq Scan on public.transaction_data_6787 t (cost=0.00..159.94 rows=5394 width=29) (actual time=0.020..2.948 rows=5394 loops=3)
                          Output: t.transaction_id, t.transaction_date
                          Buffers: shared hit=318
                          Worker 0: actual time=0.017..2.078 rows=5394 loops=1
                            Buffers: shared hit=106
                          Worker 1: actual time=0.027..2.976 rows=5394 loops=1
                            Buffers: shared hit=106
        -> Hash (cost=21.21..21.21 rows=30 width=8) (actual time=0.145..0.145 rows=35 loops=3)
              Output: pl."Id", pl."ValidFromDate"
              Buckets: 1024 Batches: 1 Memory Usage: 10kB
              Buffers: shared hit=53
              Worker 0: actual time=0.137..0.138 rows=35 loops=1
                Buffers: shared hit=18
              Worker 1: actual time=0.149..0.150 rows=35 loops=1
                Buffers: shared hit=18
              -> Bitmap Heap Scan on public."PriceLists" pl (cost=4.59..21.21 rows=30 width=8) (actual time=0.067..0.114 rows=35 loops=3)
                    Output: pl."Id", pl."ValidFromDate"
                    Recheck Cond: (pl."CustomerId" = 20)
                    Filter: (pl."PriceListTypeId" = 1)
                    Rows Removed by Filter: 6
                    Heap Blocks: exact=15
                    Buffers: shared hit=53
                    Worker 0: actual time=0.068..0.108 rows=35 loops=1
                      Buffers: shared hit=18
                    Worker 1: actual time=0.066..0.117 rows=35 loops=1
                      Buffers: shared hit=18
                    -> Bitmap Index Scan on "IX_PriceLists_CustomerId" (cost=0.00..4.58 rows=41 width=0) (actual time=0.049..0.049 rows=41 loops=3)
                          Index Cond: (pl."CustomerId" = 20)
                          Buffers: shared hit=8
                          Worker 0: actual time=0.053..0.054 rows=41 loops=1
                            Buffers: shared hit=3
                          Worker 1: actual time=0.048..0.048 rows=41 loops=1
                            Buffers: shared hit=3
Planning Time: 2.236 ms
Execution Time: 92.814 ms