On 11/4/24 15:42, Ba Jinsheng wrote:
The estimated cost is reduced by 90%, and the execution time is reduced
by 68%. The second query plan includes the operation Memoize, while the
first query plan does not. I am wondering if we can optimize the logic
anywhere to enable the second query plan.
Thanks for the curious case.
I discovered this case superficially yet, but have some results to discuss.
Postgres doesn't consider the Memoize case here because, inside the
JOIN's inner input, we reference the p_partkey column twice, which is
the primary key of the 'part' table. So, the cardinality of such a join
will always be 1. That's what we exactly watch there:
-> Nested Loop (cost=1049.15..16228.66 rows=1 width=34) (actual
time=0.617..39.544 rows=485 loops=1)
And all the joins upstairs are also NestLoop, because of this original
error. So, if you want to find a solution - discuss how to estimate
correctly clauses like:
(x=PRIMARY KEY) AND (y=PRIMARY KEY).
The second thing here is: if you replace '=' with 'IN':
and ps_supplycost = (select min(ps_supplycost) ...
and ps_supplycost IN (select min(ps_supplycost) ...
You will have pulled up a subquery. On my laptop, this trick accelerated
the query from 333ms to 44ms. That's exactly one of the tricks (IMO)
that swarm64 used years ago to show that it is faster than upstream
Postgres.
I want to work around this optimisation a bit later because to do a
pull-up, we need to prove the single result of the subquery beforehand.
Also, playing with AQO, as usual, I found two alternative query plans
that the optimiser can find in the case of more or less correct
cardinality prediction. See these plans in the attachment. I hope they
can be useful for your further analysis.
--
regards, Andrei Lepikhov
Limit (cost=16231.61..16231.62 rows=1 width=195) (actual time=44.053..44.115 rows=100 loops=1)
-> Sort (cost=16231.61..16231.62 rows=1 width=195) (actual time=44.051..44.109 rows=100 loops=1)
Sort Key: supplier.s_acctbal DESC, nation.n_name, supplier.s_name, part.p_partkey
Sort Method: top-N heapsort Memory: 69kB
-> Nested Loop (cost=1049.43..16231.60 rows=1 width=195) (actual time=0.649..43.628 rows=485 loops=1)
Join Filter: (region.r_regionkey = nation.n_regionkey)
-> Nested Loop (cost=1049.43..16230.53 rows=1 width=199) (actual time=0.642..42.753 rows=485 loops=1)
Join Filter: (nation.n_nationkey = supplier.s_nationkey)
Rows Removed by Join Filter: 6381
-> Nested Loop (cost=1049.43..16228.97 rows=1 width=170) (actual time=0.629..41.048 rows=485 loops=1)
-> Nested Loop (cost=1049.15..16228.66 rows=1 width=34) (actual time=0.617..39.544 rows=485 loops=1)
-> Gather (cost=1000.00..6458.40 rows=804 width=30) (actual time=0.422..10.338 rows=826 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on part (cost=0.00..5378.00 rows=335 width=30) (actual time=0.072..14.345 rows=275 loops=3)
Filter: (((p_type)::text ~~ '%STEEL'::text) AND (p_size = 30))
Rows Removed by Filter: 66391
-> Hash Join (cost=49.15..61.23 rows=1 width=8) (actual time=0.034..0.034 rows=1 loops=826)
Hash Cond: (partsupp.ps_supplycost = (min(partsupp_1.ps_supplycost)))
-> Index Scan using partsupp_pkey on partsupp (cost=0.42..12.50 rows=4 width=14) (actual time=0.004..0.005 rows=4 loops=485)
Index Cond: (ps_partkey = part.p_partkey)
-> Hash (cost=48.71..48.71 rows=1 width=32) (actual time=0.029..0.029 rows=1 loops=826)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Aggregate (cost=48.70..48.71 rows=1 width=32) (actual time=0.029..0.029 rows=1 loops=826)
-> Nested Loop (cost=0.85..48.70 rows=1 width=6) (actual time=0.022..0.028 rows=1 loops=826)
Join Filter: (region_1.r_regionkey = nation_1.n_regionkey)
Rows Removed by Join Filter: 3
-> Seq Scan on region region_1 (cost=0.00..1.06 rows=1 width=4) (actual time=0.001..0.002 rows=1 loops=826)
Filter: (r_name = 'ASIA'::bpchar)
Rows Removed by Filter: 4
-> Nested Loop (cost=0.85..47.58 rows=4 width=10) (actual time=0.010..0.026 rows=4 loops=826)
-> Nested Loop (cost=0.71..46.96 rows=4 width=10) (actual time=0.008..0.019 rows=4 loops=826)
-> Index Scan using partsupp_pkey on partsupp partsupp_1 (cost=0.42..13.75 rows=4 width=10) (actual time=0.005..0.006 rows=4 loops=826)
Index Cond: (ps_partkey = part.p_partkey)
-> Index Scan using supplier_pkey on supplier supplier_1 (cost=0.29..8.30 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=3304)
Index Cond: (s_suppkey = partsupp_1.ps_suppkey)
-> Index Scan using nation_pkey on nation nation_1 (cost=0.14..0.16 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=3304)
Index Cond: (n_nationkey = supplier_1.s_nationkey)
-> Index Scan using supplier_pkey on supplier (cost=0.29..0.30 rows=1 width=144) (actual time=0.002..0.002 rows=1 loops=485)
Index Cond: (s_suppkey = partsupp.ps_suppkey)
-> Seq Scan on nation (cost=0.00..1.25 rows=25 width=37) (actual time=0.001..0.002 rows=14 loops=485)
-> Seq Scan on region (cost=0.00..1.06 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=485)
Filter: (r_name = 'ASIA'::bpchar)
Rows Removed by Filter: 2
Planning Time: 3.864 ms
Execution Time: 44.298 ms
Limit (cost=16666.46..16666.71 rows=100 width=195) (actual time=40.686..40.744 rows=100 loops=1)
AQO not used, fss=0
-> Sort (cost=16666.46..16666.93 rows=187 width=195) (actual time=40.685..40.739 rows=100 loops=1)
AQO not used, fss=0
Sort Key: supplier.s_acctbal DESC, nation.n_name, supplier.s_name, part.p_partkey
Sort Method: top-N heapsort Memory: 73kB
-> Nested Loop (cost=1049.43..16659.40 rows=187 width=195) (actual time=0.246..40.378 rows=485 loops=1)
AQO: rows=187, error=-159%, fss=1472849439
Join Filter: (nation.n_nationkey = supplier.s_nationkey)
Rows Removed by Join Filter: 1940
-> Nested Loop (cost=0.00..2.62 rows=5 width=33) (actual time=0.018..0.024 rows=5 loops=1)
AQO: rows=5, error=0%, fss=1568513763
Join Filter: (region.r_regionkey = nation.n_regionkey)
Rows Removed by Join Filter: 20
-> Seq Scan on region (cost=0.00..1.06 rows=1 width=4) (actual time=0.012..0.013 rows=1 loops=1)
AQO: rows=1, error=0%, fss=-1505563826
Filter: (r_name = 'ASIA'::bpchar)
Rows Removed by Filter: 4
-> Seq Scan on nation (cost=0.00..1.25 rows=25 width=37) (actual time=0.002..0.005 rows=25 loops=1)
AQO: rows=25, error=0%, fss=-400844413
-> Materialize (cost=1049.43..16621.61 rows=485 width=170) (actual time=0.045..8.009 rows=485 loops=5)
AQO: rows=485, error=0%, fss=-666749616
-> Nested Loop (cost=1049.43..16619.19 rows=485 width=170) (actual time=0.224..39.642 rows=485 loops=1)
AQO: rows=485, error=0%, fss=-666749616
-> Nested Loop (cost=1049.15..16471.86 rows=485 width=34) (actual time=0.220..38.151 rows=485 loops=1)
AQO: rows=485, error=0%, fss=1786374427
-> Gather (cost=1000.00..6460.60 rows=826 width=30) (actual time=0.151..9.554 rows=826 loops=1)
AQO: rows=826, error=0%, fss=-156496080
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on part (cost=0.00..5378.00 rows=344 width=30) (actual time=0.065..13.921 rows=275 loops=3)
AQO: rows=826, error=0%, fss=-156496080
Filter: (((p_type)::text ~~ '%STEEL'::text) AND (p_size = 30))
Rows Removed by Filter: 66391
-> Hash Join (cost=49.15..61.20 rows=1 width=8) (actual time=0.033..0.033 rows=1 loops=826)
AQO not used, fss=0
Hash Cond: (partsupp.ps_supplycost = (min(partsupp_1.ps_supplycost)))
-> Index Scan using partsupp_pkey on partsupp (cost=0.42..12.47 rows=4 width=14) (actual time=0.004..0.005 rows=4 loops=485)
AQO: rows=4, error=0%, fss=1039583350
Index Cond: (ps_partkey = part.p_partkey)
-> Hash (cost=48.71..48.71 rows=1 width=32) (actual time=0.029..0.029 rows=1 loops=826)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Aggregate (cost=48.70..48.71 rows=1 width=32) (actual time=0.028..0.028 rows=1 loops=826)
AQO not used, fss=0
-> Nested Loop (cost=0.85..48.70 rows=2 width=6) (actual time=0.022..0.027 rows=1 loops=826)
AQO: rows=2, error=61%, fss=-1497906424
Join Filter: (region_1.r_regionkey = nation_1.n_regionkey)
Rows Removed by Join Filter: 3
-> Seq Scan on region region_1 (cost=0.00..1.06 rows=1 width=4) (actual time=0.001..0.002 rows=1 loops=826)
AQO: rows=1, error=0%, fss=-1430902349
Filter: (r_name = 'ASIA'::bpchar)
Rows Removed by Filter: 4
-> Nested Loop (cost=0.85..47.58 rows=4 width=10) (actual time=0.010..0.025 rows=4 loops=826)
AQO: rows=4, error=0%, fss=492761352
-> Nested Loop (cost=0.71..46.96 rows=4 width=10) (actual time=0.008..0.018 rows=4 loops=826)
AQO: rows=4, error=0%, fss=-738103366
-> Index Scan using partsupp_pkey on partsupp partsupp_1 (cost=0.42..13.75 rows=4 width=10) (actual time=0.005..0.006 rows=4 loops=826)
AQO: rows=4, error=0%, fss=389205821
Index Cond: (ps_partkey = part.p_partkey)
-> Index Scan using supplier_pkey on supplier supplier_1 (cost=0.29..8.30 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=3304)
AQO: rows=1, error=0%, fss=-1023226617
Index Cond: (s_suppkey = partsupp_1.ps_suppkey)
-> Index Scan using nation_pkey on nation nation_1 (cost=0.14..0.16 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=3304)
AQO: rows=1, error=0%, fss=366000422
Index Cond: (n_nationkey = supplier_1.s_nationkey)
-> Index Scan using supplier_pkey on supplier (cost=0.29..0.30 rows=1 width=144) (actual time=0.002..0.002 rows=1 loops=485)
AQO: rows=1, error=0%, fss=-1728777918
Index Cond: (s_suppkey = partsupp.ps_suppkey)
Planning Time: 8.399 ms
Execution Time: 41.883 ms
Using aqo: true
AQO mode: LEARN
Query hash: 8280828251003381383
JOINS: 8
(74 rows)
Limit (cost=11147.51..16641.77 rows=100 width=195) (actual time=22.060..26.127 rows=100 loops=1)
AQO not used, fss=0
-> Nested Loop (cost=11147.51..22136.04 rows=200 width=195) (actual time=22.059..26.119 rows=100 loops=1)
AQO: rows=200, error=50%, fss=1472849439
Join Filter: (partsupp.ps_supplycost = (min(partsupp_1.ps_supplycost)))
Rows Removed by Join Filter: 36
-> Gather Merge (cost=11098.81..11125.13 rows=226 width=201) (actual time=21.995..22.153 rows=136 loops=1)
AQO: rows=226, error=40%, fss=1303452119
Workers Planned: 2
Workers Launched: 2
-> Sort (cost=10098.79..10099.02 rows=94 width=201) (actual time=19.775..19.787 rows=150 loops=3)
AQO: rows=226, error=33%, fss=1303452119
Sort Key: supplier.s_acctbal DESC, nation.n_name, supplier.s_name, part.p_partkey
Sort Method: quicksort Memory: 79kB
Worker 0: Sort Method: quicksort Memory: 70kB
Worker 1: Sort Method: quicksort Memory: 64kB
-> Hash Join (cost=408.43..10095.70 rows=94 width=201) (actual time=3.274..19.580 rows=216 loops=3)
AQO: rows=226, error=-187%, fss=1303452119
Hash Cond: (partsupp.ps_suppkey = supplier.s_suppkey)
-> Nested Loop (cost=0.42..9679.78 rows=1377 width=40) (actual time=0.054..16.122 rows=1101 loops=3)
AQO: rows=3304, error=0%, fss=1931129930
-> Parallel Seq Scan on part (cost=0.00..5378.00 rows=344 width=30) (actual time=0.034..13.963 rows=275 loops=3)
AQO: rows=826, error=0%, fss=-156496080
Filter: (((p_type)::text ~~ '%STEEL'::text) AND (p_size = 30))
Rows Removed by Filter: 66391
-> Index Scan using partsupp_pkey on partsupp (cost=0.42..12.47 rows=4 width=14) (actual time=0.006..0.007 rows=4 loops=826)
AQO: rows=4, error=0%, fss=1039583350
Index Cond: (ps_partkey = part.p_partkey)
-> Hash (cost=382.96..382.96 rows=2003 width=169) (actual time=3.044..3.046 rows=2003 loops=3)
Buckets: 2048 Batches: 1 Memory Usage: 413kB
-> Hash Join (cost=2.46..382.96 rows=2003 width=169) (actual time=0.045..2.424 rows=2003 loops=3)
AQO: rows=2003, error=0%, fss=-70579627
Hash Cond: (supplier.s_nationkey = nation.n_nationkey)
-> Seq Scan on supplier (cost=0.00..323.00 rows=10000 width=144) (actual time=0.007..0.863 rows=10000 loops=3)
AQO: rows=10000, error=0%, fss=1509014409
-> Hash (cost=2.40..2.40 rows=5 width=33) (actual time=0.030..0.032 rows=5 loops=3)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Hash Join (cost=1.07..2.40 rows=5 width=33) (actual time=0.024..0.029 rows=5 loops=3)
AQO: rows=5, error=0%, fss=1568513763
Hash Cond: (nation.n_regionkey = region.r_regionkey)
-> Seq Scan on nation (cost=0.00..1.25 rows=25 width=37) (actual time=0.005..0.007 rows=25 loops=3)
AQO: rows=25, error=0%, fss=-400844413
-> Hash (cost=1.06..1.06 rows=1 width=4) (actual time=0.010..0.010 rows=1 loops=3)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on region (cost=0.00..1.06 rows=1 width=4) (actual time=0.005..0.006 rows=1 loops=3)
AQO: rows=1, error=0%, fss=-1505563826
Filter: (r_name = 'ASIA'::bpchar)
Rows Removed by Filter: 4
-> Aggregate (cost=48.70..48.71 rows=1 width=32) (actual time=0.028..0.028 rows=1 loops=136)
AQO not used, fss=0
-> Nested Loop (cost=0.85..48.70 rows=1 width=6) (actual time=0.016..0.027 rows=2 loops=136)
AQO: rows=1, error=-68%, fss=-1497906424
Join Filter: (region_1.r_regionkey = nation_1.n_regionkey)
Rows Removed by Join Filter: 2
-> Seq Scan on region region_1 (cost=0.00..1.06 rows=1 width=4) (actual time=0.001..0.002 rows=1 loops=136)
AQO: rows=1, error=0%, fss=-1430902349
Filter: (r_name = 'ASIA'::bpchar)
Rows Removed by Filter: 4
-> Nested Loop (cost=0.85..47.58 rows=4 width=10) (actual time=0.010..0.025 rows=4 loops=136)
AQO: rows=4, error=0%, fss=492761352
-> Nested Loop (cost=0.71..46.96 rows=4 width=10) (actual time=0.008..0.018 rows=4 loops=136)
AQO: rows=4, error=0%, fss=-738103366
-> Index Scan using partsupp_pkey on partsupp partsupp_1 (cost=0.42..13.75 rows=4 width=10) (actual time=0.005..0.006 rows=4 loops=136)
AQO: rows=4, error=0%, fss=389205821
Index Cond: (ps_partkey = part.p_partkey)
-> Index Scan using supplier_pkey on supplier supplier_1 (cost=0.29..8.30 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=544)
AQO: rows=1, error=0%, fss=-1023226617
Index Cond: (s_suppkey = partsupp_1.ps_suppkey)
-> Index Scan using nation_pkey on nation nation_1 (cost=0.14..0.16 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=544)
AQO: rows=1, error=0%, fss=366000422
Index Cond: (n_nationkey = supplier_1.s_nationkey)
Planning Time: 9.049 ms
Execution Time: 27.197 ms
Using aqo: true
AQO mode: LEARN
Query hash: 8280828251003381383
JOINS: 8
(77 rows)