Hello,
We have several select statements whose performance is greatly improved by deleting some stats from pg_statistic. With the stats present the database reaches 100% cpu at 13k queries per second. Without these stats, the same machine can handle over 29k queries per second. We were able replicate this behavior with just a single join that all these queries contain. When the stats are present the planner chooses to hash join, and without stats perform a nested loop. The plan using a hash join has a higher estimated cost, and as previously mentioned, uses more cpu.
The two tables involved in this query are described below; bag_type and bag. There are 6 bag_type rows and around 6 million bag rows. During this simplified scenario, no writes were occurring. Under normal circumstances rows can be inserted into bag, and no rows in these tables are updated or deleted.
\d bag_type
Table "public.bag_type"
Column | Type | Collation | Nullable | Default
-----------+---------+-----------+----------+--------------------------------------
id | bigint | | not null | nextval('bag_type_id_seq'::regclass)
name | text | | not null |
has_slots | boolean | | not null |
game | text | | not null |
Indexes:
"bag_type_pk" PRIMARY KEY, btree (id)
"bag_name_u1" UNIQUE CONSTRAINT, btree (name, game)
Referenced by:
TABLE "bag" CONSTRAINT "bag_fk1" FOREIGN KEY (bag_type_id) REFERENCES bag_type(id)
\d bag
Table "public.bag"
Column | Type | Collation | Nullable | Default
-------------+--------+-----------+----------+---------------------------------
id | bigint | | not null | nextval('bag_id_seq'::regclass)
owner_id | uuid | | not null |
bag_type_id | bigint | | not null |
Indexes:
"bag_pk" PRIMARY KEY, btree (id)
"bag_owner_type_u1" UNIQUE CONSTRAINT, btree (owner_id, bag_type_id)
Foreign-key constraints:
"bag_fk1" FOREIGN KEY (bag_type_id) REFERENCES bag_type(id)
Referenced by:
TABLE "item" CONSTRAINT "item_fk1" FOREIGN KEY (bag_id) REFERENCES bag(id)
The pared down query joins the two tables.
EXPLAIN (ANALYZE, BUFFERS)
SELECT 1
FROM bag
INNER JOIN bag_type ON bag.bag_type_id = bag_type.id
WHERE owner_id = '00000000-0000-0000-0000-000000076100'
AND game = 'test_alpha'
AND name = ANY(ARRAY['item','wallet','buildingFixed']);
With stats on the bag_type table present, the planner uses a hash join. I noticed that the estimate of the index scan of bag_owner_type_u1 is too high at 8 rows. No owner can have more than 6 bags, so 8 should be logically impossible. Also, given 3 bag_types and a specific owner, there can't be more than 3 rows due to the bag_owner_type_u1 index.
ANALYZE bag_type;
https://explain.depesz.com/s/zcI (Slower, hash join)
If I remove the stats on the bag_type table, the planner estimates 1 row and uses a nested loop.
DELETE FROM pg_statistic s
USING pg_class c
WHERE c.oid = s.starelid
AND c.relname = 'bag_type';
https://explain.depesz.com/s/yBuEo (nested loop)
Below are various stats and configuration options, in case they are helpful. I've tried reindexing everything, clustering the tables and ran vacuum full as well. I've tried increasing the default statistics target (this actually made performance much worse). I’ve tested this on fresh volumes with synthetic data, as well as on replicas of prod data. I’ve also tested this on different ec2 instance types (r4.16xl and c4.8xl). In all cases the bag_type stats resulted in worse performance. I was hoping someone would be able to give advice on how to improve these queries that doesn’t involve deleting stats.
Thanks
--Jeremy
SELECT version();
version
-------------------------------------------------------------------------------------------------------------------------
PostgreSQL 10.7 (Debian 10.7-1.pgdg80+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 4.9.2-10+deb8u2) 4.9.2, 64-bit
(1 row)
SELECT relname, relpages, reltuples, relallvisible, relkind, relnatts, relhassubclass, reloptions, pg_table_size(oid) FROM pg_class WHERE relname='bag_type' OR relname = 'bag';
relname | relpages | reltuples | relallvisible | relkind | relnatts | relhassubclass | reloptions | pg_table_size
----------+----------+-------------+---------------+---------+----------+----------------+------------+---------------
bag | 44115 | 5.99964e+06 | 0 | r | 3 | f | | 361390080
bag_type | 1 | 6 | 0 | r | 4 | f | | 16384
(2 rows)
SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename, attname, inherited, null_frac, n_distinct, array_length(most_common_vals,1) n_mcv, array_length(histogram_bounds,1) n_hist, correlation FROM pg_stats WHERE attname='bag_type_id' AND tablename='bag' ORDER BY 1 DESC;
frac_mcv | tablename | attname | inherited | null_frac | n_distinct | n_mcv | n_hist | correlation
----------+-----------+-------------+-----------+-----------+------------+-------+--------+-------------
1 | bag | bag_type_id | f | 0 | 6 | 6 | | 0.167682
(1 row)
SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename, attname, inherited, null_frac, n_distinct, array_length(most_common_vals,1) n_mcv, array_length(histogram_bounds,1) n_hist, correlation FROM pg_stats WHERE attname='owner_id' AND tablename='bag' ORDER BY 1 DESC;
frac_mcv | tablename | attname | inherited | null_frac | n_distinct | n_mcv | n_hist | correlation
------------+-----------+----------+-----------+-----------+------------+-------+--------+-------------
0.00680001 | bag | owner_id | f | 0 | -0.123982 | 100 | 101 | 0.994306
(1 row)
SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename, attname, inherited, null_frac, n_distinct, array_length(most_common_vals,1) n_mcv, array_length(histogram_bounds,1) n_hist, correlation FROM pg_stats WHERE attname='name' AND tablename='bag_type' ORDER BY 1 DESC;
frac_mcv | tablename | attname | inherited | null_frac | n_distinct | n_mcv | n_hist | correlation
----------+-----------+---------+-----------+-----------+------------+-------+--------+-------------
| bag_type | name | f | 0 | -1 | | 6 | -0.428571
(1 row)
SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename, attname, inherited, null_frac, n_distinct, array_length(most_common_vals,1) n_mcv, array_length(histogram_bounds,1) n_hist, correlation FROM pg_stats WHERE attname='game' AND tablename='bag_type' ORDER BY 1 DESC;
frac_mcv | tablename | attname | inherited | null_frac | n_distinct | n_mcv | n_hist | correlation
----------+-----------+---------+-----------+-----------+------------+-------+--------+-------------
1 | bag_type | game | f | 0 | -0.166667 | 1 | | 1
(1 row)
SELECT name, current_setting(name), SOURCE
FROM pg_settings
WHERE SOURCE NOT IN ('default', 'override');
name | current_setting | source
-------------------------------------+---------------------------------------------+--------------------
application_name | psql | client
archive_command | /wal-e-shim wal-push %p | configuration file
archive_mode | on | configuration file
archive_timeout | 1min | configuration file
autovacuum | on | configuration file
autovacuum_max_workers | 6 | configuration file
autovacuum_vacuum_scale_factor | 0 | configuration file
autovacuum_vacuum_threshold | 10000 | configuration file
autovacuum_work_mem | -1 | configuration file
checkpoint_completion_target | 0.9 | configuration file
checkpoint_timeout | 30min | configuration file
checkpoint_warning | 30s | configuration file
client_encoding | SQL_ASCII | client
DateStyle | ISO, MDY | configuration file
dynamic_shared_memory_type | posix | configuration file
effective_cache_size | 42432000kB | configuration file
fsync | on | configuration file
full_page_writes | on | configuration file
huge_pages | try | configuration file
idle_in_transaction_session_timeout | 10min | configuration file
lc_messages | C | configuration file
lc_monetary | C | configuration file
lc_numeric | C | configuration file
lc_time | C | configuration file
listen_addresses | * | configuration file
log_autovacuum_min_duration | 0 | configuration file
log_checkpoints | on | configuration file
log_destination | stderr | configuration file
log_line_prefix | %t [%p-%l] %q%u@%d | configuration file
log_lock_waits | on | configuration file
log_min_duration_statement | 1s | configuration file
log_temp_files | 0 | configuration file
log_timezone | UTC | configuration file
maintenance_work_mem | 3536000kB | configuration file
max_connections | 400 | configuration file
max_prepared_transactions | 100 | configuration file
max_stack_depth | 2MB | configuration file
max_wal_senders | 5 | configuration file
max_wal_size | 34GB | configuration file
pg_partman_bgw.dbname | redacted | configuration file
pg_partman_bgw.interval | 3600 | configuration file
pg_partman_bgw.role | postgres | configuration file
pg_stat_statements.track | all | configuration file
port | 5432 | command line
random_page_cost | 1.1 | configuration file
shared_buffers | 14144000kB | configuration file
shared_preload_libraries | plpgsql, pg_partman_bgw, pg_stat_statements | configuration file
stats_temp_directory | /var/run/postgresql/pg_stat_tmp | configuration file
superuser_reserved_connections | 5 | configuration file
synchronous_commit | on | configuration file
TimeZone | UTC | configuration file
unix_socket_directories | /var/run/postgresql | configuration file
unix_socket_group | postgres | configuration file
unix_socket_permissions | 0700 | configuration file
wal_keep_segments | 64 | configuration file
wal_level | replica | configuration file
wal_sync_method | fsync | configuration file
work_mem | 141440kB | configuration file
(58 rows)