Re: Performance of Query 2 in TPC-H

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 11/4/24 15:42, Ba Jinsheng wrote:
The estimated cost is reduced by 90%, and the execution time is reduced by 68%. The second query plan includes the operation Memoize, while the first query plan does not. I am wondering if we can optimize the logic anywhere to enable the second query plan.
Thanks for the curious case.
I discovered this case superficially yet, but have some results to discuss.
Postgres doesn't consider the Memoize case here because, inside the JOIN's inner input, we reference the p_partkey column twice, which is the primary key of the 'part' table. So, the cardinality of such a join will always be 1. That's what we exactly watch there:

-> Nested Loop (cost=1049.15..16228.66 rows=1 width=34) (actual time=0.617..39.544 rows=485 loops=1)

And all the joins upstairs are also NestLoop, because of this original error. So, if you want to find a solution - discuss how to estimate correctly clauses like:
(x=PRIMARY KEY) AND (y=PRIMARY KEY).

The second thing here is: if you replace '=' with 'IN':
and ps_supplycost = (select min(ps_supplycost) ...
and ps_supplycost IN (select min(ps_supplycost) ...

You will have pulled up a subquery. On my laptop, this trick accelerated the query from 333ms to 44ms. That's exactly one of the tricks (IMO) that swarm64 used years ago to show that it is faster than upstream Postgres. I want to work around this optimisation a bit later because to do a pull-up, we need to prove the single result of the subquery beforehand.

Also, playing with AQO, as usual, I found two alternative query plans that the optimiser can find in the case of more or less correct cardinality prediction. See these plans in the attachment. I hope they can be useful for your further analysis.

--
regards, Andrei Lepikhov
 Limit  (cost=16231.61..16231.62 rows=1 width=195) (actual time=44.053..44.115 rows=100 loops=1)
   ->  Sort  (cost=16231.61..16231.62 rows=1 width=195) (actual time=44.051..44.109 rows=100 loops=1)
         Sort Key: supplier.s_acctbal DESC, nation.n_name, supplier.s_name, part.p_partkey
         Sort Method: top-N heapsort  Memory: 69kB
         ->  Nested Loop  (cost=1049.43..16231.60 rows=1 width=195) (actual time=0.649..43.628 rows=485 loops=1)
               Join Filter: (region.r_regionkey = nation.n_regionkey)
               ->  Nested Loop  (cost=1049.43..16230.53 rows=1 width=199) (actual time=0.642..42.753 rows=485 loops=1)
                     Join Filter: (nation.n_nationkey = supplier.s_nationkey)
                     Rows Removed by Join Filter: 6381
                     ->  Nested Loop  (cost=1049.43..16228.97 rows=1 width=170) (actual time=0.629..41.048 rows=485 loops=1)
                           ->  Nested Loop  (cost=1049.15..16228.66 rows=1 width=34) (actual time=0.617..39.544 rows=485 loops=1)
                                 ->  Gather  (cost=1000.00..6458.40 rows=804 width=30) (actual time=0.422..10.338 rows=826 loops=1)
                                       Workers Planned: 2
                                       Workers Launched: 2
                                       ->  Parallel Seq Scan on part  (cost=0.00..5378.00 rows=335 width=30) (actual time=0.072..14.345 rows=275 loops=3)
                                             Filter: (((p_type)::text ~~ '%STEEL'::text) AND (p_size = 30))
                                             Rows Removed by Filter: 66391
                                 ->  Hash Join  (cost=49.15..61.23 rows=1 width=8) (actual time=0.034..0.034 rows=1 loops=826)
                                       Hash Cond: (partsupp.ps_supplycost = (min(partsupp_1.ps_supplycost)))
                                       ->  Index Scan using partsupp_pkey on partsupp  (cost=0.42..12.50 rows=4 width=14) (actual time=0.004..0.005 rows=4 loops=485)
                                             Index Cond: (ps_partkey = part.p_partkey)
                                       ->  Hash  (cost=48.71..48.71 rows=1 width=32) (actual time=0.029..0.029 rows=1 loops=826)
                                             Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                             ->  Aggregate  (cost=48.70..48.71 rows=1 width=32) (actual time=0.029..0.029 rows=1 loops=826)
                                                   ->  Nested Loop  (cost=0.85..48.70 rows=1 width=6) (actual time=0.022..0.028 rows=1 loops=826)
                                                         Join Filter: (region_1.r_regionkey = nation_1.n_regionkey)
                                                         Rows Removed by Join Filter: 3
                                                         ->  Seq Scan on region region_1  (cost=0.00..1.06 rows=1 width=4) (actual time=0.001..0.002 rows=1 loops=826)
                                                               Filter: (r_name = 'ASIA'::bpchar)
                                                               Rows Removed by Filter: 4
                                                         ->  Nested Loop  (cost=0.85..47.58 rows=4 width=10) (actual time=0.010..0.026 rows=4 loops=826)
                                                               ->  Nested Loop  (cost=0.71..46.96 rows=4 width=10) (actual time=0.008..0.019 rows=4 loops=826)
                                                                     ->  Index Scan using partsupp_pkey on partsupp partsupp_1  (cost=0.42..13.75 rows=4 width=10) (actual time=0.005..0.006 rows=4 loops=826)
                                                                           Index Cond: (ps_partkey = part.p_partkey)
                                                                     ->  Index Scan using supplier_pkey on supplier supplier_1  (cost=0.29..8.30 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=3304)
                                                                           Index Cond: (s_suppkey = partsupp_1.ps_suppkey)
                                                               ->  Index Scan using nation_pkey on nation nation_1  (cost=0.14..0.16 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=3304)
                                                                     Index Cond: (n_nationkey = supplier_1.s_nationkey)
                           ->  Index Scan using supplier_pkey on supplier  (cost=0.29..0.30 rows=1 width=144) (actual time=0.002..0.002 rows=1 loops=485)
                                 Index Cond: (s_suppkey = partsupp.ps_suppkey)
                     ->  Seq Scan on nation  (cost=0.00..1.25 rows=25 width=37) (actual time=0.001..0.002 rows=14 loops=485)
               ->  Seq Scan on region  (cost=0.00..1.06 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=485)
                     Filter: (r_name = 'ASIA'::bpchar)
                     Rows Removed by Filter: 2
 Planning Time: 3.864 ms
 Execution Time: 44.298 ms

Limit  (cost=16666.46..16666.71 rows=100 width=195) (actual time=40.686..40.744 rows=100 loops=1)
   AQO not used, fss=0
   ->  Sort  (cost=16666.46..16666.93 rows=187 width=195) (actual time=40.685..40.739 rows=100 loops=1)
         AQO not used, fss=0
         Sort Key: supplier.s_acctbal DESC, nation.n_name, supplier.s_name, part.p_partkey
         Sort Method: top-N heapsort  Memory: 73kB
         ->  Nested Loop  (cost=1049.43..16659.40 rows=187 width=195) (actual time=0.246..40.378 rows=485 loops=1)
               AQO: rows=187, error=-159%, fss=1472849439
               Join Filter: (nation.n_nationkey = supplier.s_nationkey)
               Rows Removed by Join Filter: 1940
               ->  Nested Loop  (cost=0.00..2.62 rows=5 width=33) (actual time=0.018..0.024 rows=5 loops=1)
                     AQO: rows=5, error=0%, fss=1568513763
                     Join Filter: (region.r_regionkey = nation.n_regionkey)
                     Rows Removed by Join Filter: 20
                     ->  Seq Scan on region  (cost=0.00..1.06 rows=1 width=4) (actual time=0.012..0.013 rows=1 loops=1)
                           AQO: rows=1, error=0%, fss=-1505563826
                           Filter: (r_name = 'ASIA'::bpchar)
                           Rows Removed by Filter: 4
                     ->  Seq Scan on nation  (cost=0.00..1.25 rows=25 width=37) (actual time=0.002..0.005 rows=25 loops=1)
                           AQO: rows=25, error=0%, fss=-400844413
               ->  Materialize  (cost=1049.43..16621.61 rows=485 width=170) (actual time=0.045..8.009 rows=485 loops=5)
                     AQO: rows=485, error=0%, fss=-666749616
                     ->  Nested Loop  (cost=1049.43..16619.19 rows=485 width=170) (actual time=0.224..39.642 rows=485 loops=1)
                           AQO: rows=485, error=0%, fss=-666749616
                           ->  Nested Loop  (cost=1049.15..16471.86 rows=485 width=34) (actual time=0.220..38.151 rows=485 loops=1)
                                 AQO: rows=485, error=0%, fss=1786374427
                                 ->  Gather  (cost=1000.00..6460.60 rows=826 width=30) (actual time=0.151..9.554 rows=826 loops=1)
                                       AQO: rows=826, error=0%, fss=-156496080
                                       Workers Planned: 2
                                       Workers Launched: 2
                                       ->  Parallel Seq Scan on part  (cost=0.00..5378.00 rows=344 width=30) (actual time=0.065..13.921 rows=275 loops=3)
                                             AQO: rows=826, error=0%, fss=-156496080
                                             Filter: (((p_type)::text ~~ '%STEEL'::text) AND (p_size = 30))
                                             Rows Removed by Filter: 66391
                                 ->  Hash Join  (cost=49.15..61.20 rows=1 width=8) (actual time=0.033..0.033 rows=1 loops=826)
                                       AQO not used, fss=0
                                       Hash Cond: (partsupp.ps_supplycost = (min(partsupp_1.ps_supplycost)))
                                       ->  Index Scan using partsupp_pkey on partsupp  (cost=0.42..12.47 rows=4 width=14) (actual time=0.004..0.005 rows=4 loops=485)
                                             AQO: rows=4, error=0%, fss=1039583350
                                             Index Cond: (ps_partkey = part.p_partkey)
                                       ->  Hash  (cost=48.71..48.71 rows=1 width=32) (actual time=0.029..0.029 rows=1 loops=826)
                                             Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                             ->  Aggregate  (cost=48.70..48.71 rows=1 width=32) (actual time=0.028..0.028 rows=1 loops=826)
                                                   AQO not used, fss=0
                                                   ->  Nested Loop  (cost=0.85..48.70 rows=2 width=6) (actual time=0.022..0.027 rows=1 loops=826)
                                                         AQO: rows=2, error=61%, fss=-1497906424
                                                         Join Filter: (region_1.r_regionkey = nation_1.n_regionkey)
                                                         Rows Removed by Join Filter: 3
                                                         ->  Seq Scan on region region_1  (cost=0.00..1.06 rows=1 width=4) (actual time=0.001..0.002 rows=1 loops=826)
                                                               AQO: rows=1, error=0%, fss=-1430902349
                                                               Filter: (r_name = 'ASIA'::bpchar)
                                                               Rows Removed by Filter: 4
                                                         ->  Nested Loop  (cost=0.85..47.58 rows=4 width=10) (actual time=0.010..0.025 rows=4 loops=826)
                                                               AQO: rows=4, error=0%, fss=492761352
                                                               ->  Nested Loop  (cost=0.71..46.96 rows=4 width=10) (actual time=0.008..0.018 rows=4 loops=826)
                                                                     AQO: rows=4, error=0%, fss=-738103366
                                                                     ->  Index Scan using partsupp_pkey on partsupp partsupp_1  (cost=0.42..13.75 rows=4 width=10) (actual time=0.005..0.006 rows=4 loops=826)
                                                                           AQO: rows=4, error=0%, fss=389205821
                                                                           Index Cond: (ps_partkey = part.p_partkey)
                                                                     ->  Index Scan using supplier_pkey on supplier supplier_1  (cost=0.29..8.30 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=3304)
                                                                           AQO: rows=1, error=0%, fss=-1023226617
                                                                           Index Cond: (s_suppkey = partsupp_1.ps_suppkey)
                                                               ->  Index Scan using nation_pkey on nation nation_1  (cost=0.14..0.16 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=3304)
                                                                     AQO: rows=1, error=0%, fss=366000422
                                                                     Index Cond: (n_nationkey = supplier_1.s_nationkey)
                           ->  Index Scan using supplier_pkey on supplier  (cost=0.29..0.30 rows=1 width=144) (actual time=0.002..0.002 rows=1 loops=485)
                                 AQO: rows=1, error=0%, fss=-1728777918
                                 Index Cond: (s_suppkey = partsupp.ps_suppkey)
 Planning Time: 8.399 ms
 Execution Time: 41.883 ms
 Using aqo: true
 AQO mode: LEARN
 Query hash: 8280828251003381383
 JOINS: 8
(74 rows)
 Limit  (cost=11147.51..16641.77 rows=100 width=195) (actual time=22.060..26.127 rows=100 loops=1)
   AQO not used, fss=0
   ->  Nested Loop  (cost=11147.51..22136.04 rows=200 width=195) (actual time=22.059..26.119 rows=100 loops=1)
         AQO: rows=200, error=50%, fss=1472849439
         Join Filter: (partsupp.ps_supplycost = (min(partsupp_1.ps_supplycost)))
         Rows Removed by Join Filter: 36
         ->  Gather Merge  (cost=11098.81..11125.13 rows=226 width=201) (actual time=21.995..22.153 rows=136 loops=1)
               AQO: rows=226, error=40%, fss=1303452119
               Workers Planned: 2
               Workers Launched: 2
               ->  Sort  (cost=10098.79..10099.02 rows=94 width=201) (actual time=19.775..19.787 rows=150 loops=3)
                     AQO: rows=226, error=33%, fss=1303452119
                     Sort Key: supplier.s_acctbal DESC, nation.n_name, supplier.s_name, part.p_partkey
                     Sort Method: quicksort  Memory: 79kB
                     Worker 0:  Sort Method: quicksort  Memory: 70kB
                     Worker 1:  Sort Method: quicksort  Memory: 64kB
                     ->  Hash Join  (cost=408.43..10095.70 rows=94 width=201) (actual time=3.274..19.580 rows=216 loops=3)
                           AQO: rows=226, error=-187%, fss=1303452119
                           Hash Cond: (partsupp.ps_suppkey = supplier.s_suppkey)
                           ->  Nested Loop  (cost=0.42..9679.78 rows=1377 width=40) (actual time=0.054..16.122 rows=1101 loops=3)
                                 AQO: rows=3304, error=0%, fss=1931129930
                                 ->  Parallel Seq Scan on part  (cost=0.00..5378.00 rows=344 width=30) (actual time=0.034..13.963 rows=275 loops=3)
                                       AQO: rows=826, error=0%, fss=-156496080
                                       Filter: (((p_type)::text ~~ '%STEEL'::text) AND (p_size = 30))
                                       Rows Removed by Filter: 66391
                                 ->  Index Scan using partsupp_pkey on partsupp  (cost=0.42..12.47 rows=4 width=14) (actual time=0.006..0.007 rows=4 loops=826)
                                       AQO: rows=4, error=0%, fss=1039583350
                                       Index Cond: (ps_partkey = part.p_partkey)
                           ->  Hash  (cost=382.96..382.96 rows=2003 width=169) (actual time=3.044..3.046 rows=2003 loops=3)
                                 Buckets: 2048  Batches: 1  Memory Usage: 413kB
                                 ->  Hash Join  (cost=2.46..382.96 rows=2003 width=169) (actual time=0.045..2.424 rows=2003 loops=3)
                                       AQO: rows=2003, error=0%, fss=-70579627
                                       Hash Cond: (supplier.s_nationkey = nation.n_nationkey)
                                       ->  Seq Scan on supplier  (cost=0.00..323.00 rows=10000 width=144) (actual time=0.007..0.863 rows=10000 loops=3)
                                             AQO: rows=10000, error=0%, fss=1509014409
                                       ->  Hash  (cost=2.40..2.40 rows=5 width=33) (actual time=0.030..0.032 rows=5 loops=3)
                                             Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                             ->  Hash Join  (cost=1.07..2.40 rows=5 width=33) (actual time=0.024..0.029 rows=5 loops=3)
                                                   AQO: rows=5, error=0%, fss=1568513763
                                                   Hash Cond: (nation.n_regionkey = region.r_regionkey)
                                                   ->  Seq Scan on nation  (cost=0.00..1.25 rows=25 width=37) (actual time=0.005..0.007 rows=25 loops=3)
                                                         AQO: rows=25, error=0%, fss=-400844413
                                                   ->  Hash  (cost=1.06..1.06 rows=1 width=4) (actual time=0.010..0.010 rows=1 loops=3)
                                                         Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                         ->  Seq Scan on region  (cost=0.00..1.06 rows=1 width=4) (actual time=0.005..0.006 rows=1 loops=3)
                                                               AQO: rows=1, error=0%, fss=-1505563826
                                                               Filter: (r_name = 'ASIA'::bpchar)
                                                               Rows Removed by Filter: 4
         ->  Aggregate  (cost=48.70..48.71 rows=1 width=32) (actual time=0.028..0.028 rows=1 loops=136)
               AQO not used, fss=0
               ->  Nested Loop  (cost=0.85..48.70 rows=1 width=6) (actual time=0.016..0.027 rows=2 loops=136)
                     AQO: rows=1, error=-68%, fss=-1497906424
                     Join Filter: (region_1.r_regionkey = nation_1.n_regionkey)
                     Rows Removed by Join Filter: 2
                     ->  Seq Scan on region region_1  (cost=0.00..1.06 rows=1 width=4) (actual time=0.001..0.002 rows=1 loops=136)
                           AQO: rows=1, error=0%, fss=-1430902349
                           Filter: (r_name = 'ASIA'::bpchar)
                           Rows Removed by Filter: 4
                     ->  Nested Loop  (cost=0.85..47.58 rows=4 width=10) (actual time=0.010..0.025 rows=4 loops=136)
                           AQO: rows=4, error=0%, fss=492761352
                           ->  Nested Loop  (cost=0.71..46.96 rows=4 width=10) (actual time=0.008..0.018 rows=4 loops=136)
                                 AQO: rows=4, error=0%, fss=-738103366
                                 ->  Index Scan using partsupp_pkey on partsupp partsupp_1  (cost=0.42..13.75 rows=4 width=10) (actual time=0.005..0.006 rows=4 loops=136)
                                       AQO: rows=4, error=0%, fss=389205821
                                       Index Cond: (ps_partkey = part.p_partkey)
                                 ->  Index Scan using supplier_pkey on supplier supplier_1  (cost=0.29..8.30 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=544)
                                       AQO: rows=1, error=0%, fss=-1023226617
                                       Index Cond: (s_suppkey = partsupp_1.ps_suppkey)
                           ->  Index Scan using nation_pkey on nation nation_1  (cost=0.14..0.16 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=544)
                                 AQO: rows=1, error=0%, fss=366000422
                                 Index Cond: (n_nationkey = supplier_1.s_nationkey)
 Planning Time: 9.049 ms
 Execution Time: 27.197 ms
 Using aqo: true
 AQO mode: LEARN
 Query hash: 8280828251003381383
 JOINS: 8
(77 rows)


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux