On Fri, 4 Feb 2022 at 01:03, A Shaposhnikov <artyom@xxxxxxxxx> wrote:
I made a mistake yesterday claiming that the created statistics
changed the row counts in the estimates - it did not - I looked at
the wrong query yesterday. In the correct query plan the row estimate
still differs from the actual by many orders of magnitude:
Nested Loop (cost=1.01..27584834.53 rows=59608439 width=164) (actual
time=0.047..2.723 rows=854 loops=1)
It completely ignores the "limit" in SQL statement when doing the row
estimates - the estimates do not change when I change it. The business
logic really needs only 1000 rows in the result, I never need the
whole table in the result. The query always runs fast if the limit is
<= 46, when the limit >=47, it sometimes chooses to do a merge join on
2 tables with hundreds of millions of rows instead of using the
indexes. The runtime difference is 4000x.
I migrated the data to the latest postgres 14.1. Both versions run
either the slow plan or the fast plan seemingly at random. The only
reliable way to make it choose the fast plan is to decrease the
effective_cache_size to a value 20x lower than the memory available in
the system. Dropping and creating the statistics on the join table
makes no difference in the estimated row counts and query plans.
PostgreSql seems to be caching something internally and choosing the
query plans at random - sometimes it is fast, sometimes 4000x slower
without any changes in configuration or statistics. The runtime
difference is 4000x. Such randomness is clearly unacceptable. I think
I would have to try to use the "hint" extension suggested by Imre to
make it consistently choose the fast plan.
there was a similar concern raised on slack, where on shared plan (where less effective_cache_size resulted in faster execution)
"
Hey all, I have been diagnosing a slow running query and found that it speeds up when the
effective_cache_size
is reduced. It's set at ~21.5gb, and lowering to 10gb or under massively improves the query. Plans are here and here. You can see the difference is 33 - 42, where the order of execution results in a bunch more rows having to be filtered out later in the first plan. Everything I've read suggests that a bigger effective_cache_size
should be better (assuming it's within the recommended boundaries of total memory of course). Can anyone offer any insight into why this might happen, and what a good workaround would be?"
I am not sure, though
effective_cache_size
is the culprit. if there are bad estimates, even a simple query like below produces a poor plan.i did try to check the code postgres/costsize.c at master · postgres/postgres (github.com) (I am not a developer but try to read the english from code) and tried to understand how effective_cache_sizealters the cost and changes the path, but could not simulate on my laptop with small tables and less resources.
but yeah, i know 1TB is a huge setup, but can run vaccumdb and analyze all the tables in question (maybe reindex concurrently if required) and check if the row stats are still off in the plan and reducing effective_cache_size improves the execution time? I am not an expert, but just try to figure out if this is a red herring.
i also play with some gucs as in PostgreSQL: Documentation: 13: 19.7. Query Planning to check if enabling/disabling some of gucs results in a big difference in execution and then try to zero in what needs immediate vaccum or reindex or join count etc or bumping resources if required to use more cache.
postgres=# create table t(id int primary key, col1 int);
CREATE TABLE
postgres=# alter table t set (autovacuum_enabled = false, toast.autovacuum_enabled = false);
ALTER TABLE
postgres=# insert into t select x, 1 from generate_series(1, 1000000) x;
INSERT 0 1000000
postgres=# \x
Expanded display is off.
postgres=# do $$
declare i int;
begin
for i in 1..10 loop
update t set col1 = i::int;
commit;
end loop;
end; $$;
DO
---this is the default plan picked up by optimizer (tables still not analyzed)
postgres=# explain analyze select * from t where id < 100;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------
Seq Scan on t (cost=0.00..1472242.50 rows=28995800 width=8) (actual time=198.740..4849.556 rows=99 loops=1)
Filter: (id < 100)
Rows Removed by Filter: 999901
Planning Time: 1.296 ms
JIT:
Functions: 2
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 1.194 ms, Inlining 4.383 ms, Optimization 13.051 ms, Emission 6.954 ms, Total 25.581 ms
Execution Time: 4850.870 ms
(9 rows)
--play smart, disable seqscan
postgres=# set enable_seqscan TO 0;
SET
postgres=# explain analyze select * from t where id < 100;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on t (cost=235689.89..1646219.33 rows=28995800 width=8) (actual time=1.312..2.313 rows=99 loops=1)
Recheck Cond: (id < 100)
Heap Blocks: exact=32
-> Bitmap Index Scan on t_pkey (cost=0.00..228440.94 rows=28995800 width=0) (actual time=1.295..1.296 rows=126 loops=1)
Index Cond: (id < 100)
Planning Time: 0.054 ms
JIT:
Functions: 2
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 7.309 ms, Inlining 0.000 ms, Optimization 0.000 ms, Emission 0.000 ms, Total 7.309 ms
Execution Time: 9.715 ms
(11 rows)
# rule out cache
postgres=# set enable_seqscan TO 1;
SET
postgres=# explain analyze select * from t where id < 100;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------
Seq Scan on t (cost=0.00..1472242.50 rows=28995800 width=8) (actual time=196.485..4778.942 rows=99 loops=1)
Filter: (id < 100)
Rows Removed by Filter: 999901
Planning Time: 0.052 ms
JIT:
Functions: 2
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 0.286 ms, Inlining 1.389 ms, Optimization 4.462 ms, Emission 2.765 ms, Total 8.902 ms
Execution Time: 4779.314 ms
(9 rows)
#again disable seqscan
postgres=# set enable_seqscan TO 0;
SET
postgres=# explain analyze select * from t where id < 100;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on t (cost=235689.89..1646219.33 rows=28995800 width=8) (actual time=0.028..0.101 rows=99 loops=1)
Recheck Cond: (id < 100)
Heap Blocks: exact=32
-> Bitmap Index Scan on t_pkey (cost=0.00..228440.94 rows=28995800 width=0) (actual time=0.017..0.017 rows=126 loops=1)
Index Cond: (id < 100)
Planning Time: 0.047 ms
JIT:
Functions: 2
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 0.834 ms, Inlining 0.000 ms, Optimization 0.000 ms, Emission 0.000 ms, Total 0.834 ms
Execution Time: 0.978 ms
(11 rows)
#disable jit (even faster)
postgres=# set jit TO 0;
SET
postgres=# explain analyze select * from t where id < 100;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on t (cost=235689.89..1646219.33 rows=28995800 width=8) (actual time=0.030..0.115 rows=99 loops=1)
Recheck Cond: (id < 100)
Heap Blocks: exact=32
-> Bitmap Index Scan on t_pkey (cost=0.00..228440.94 rows=28995800 width=0) (actual time=0.018..0.018 rows=126 loops=1)
Index Cond: (id < 100)
Planning Time: 0.053 ms
Execution Time: 0.137 ms
(7 rows)
#disable bitmap scan (more faster)
postgres=# set enable_bitmapscan TO 0;
SET
postgres=# explain analyze select * from t where id < 100;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Index Scan using t_pkey on t (cost=0.44..2057998.94 rows=28995800 width=8) (actual time=0.018..0.147 rows=99 loops=1)
Index Cond: (id < 100)
Planning Time: 0.053 ms
Execution Time: 0.166 ms
(4 rows)
#now cleanup dead tuples, and analyze to update stats and reset all gucs
postgres=# vacuum full t; -- note full not required, 'vacuum analyze t' would suffice too.
VACUUM
postgres=# set enable_bitmapscan TO default;
SET
postgres=# set enable_seqscan TO default;
SET
postgres=# set jit to default;
SET
postgres=# explain analyze select * from t where id < 100;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on t (cost=6243.76..14835.42 rows=333333 width=8) (actual time=0.028..0.042 rows=99 loops=1)
Recheck Cond: (id < 100)
Heap Blocks: exact=3
-> Bitmap Index Scan on t_pkey (cost=0.00..6160.42 rows=333333 width=0) (actual time=0.017..0.018 rows=99 loops=1)
Index Cond: (id < 100)
Planning Time: 0.819 ms
Execution Time: 0.061 ms
(7 rows)
postgres=# analyze t;
ANALYZE
#the default plan after vacuum analyze is the fastest.
postgres=# explain analyze select * from t where id < 100;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------
Index Scan using t_pkey on t (cost=0.42..10.62 rows=92 width=8) (actual time=0.005..0.029 rows=99 loops=1)
Index Cond: (id < 100)
Planning Time: 0.159 ms
Execution Time: 0.045 ms
(4 rows)