I’m working on diagnosing an issue with a complex query on a large PG11 database we have. The planner is choosing a plan that takes 60+ seconds but if we force it to use a particular index then it takes only a couple of seconds. I’ve narrowed down what I think is the cause to a very simple join for which PG is underestimating the rows by a factor of 20. It then chooses to do a nested loop which is actually much slower than it thinks it would be based on the estimate.
I’ve managed to reproduce this underestimation issue using a simple data set that is very similar in distribution to what we have in our real database:
CREATE TABLE test_a (
a_id VARCHAR(255) PRIMARY KEY,
group_val VARCHAR(255) NOT NULL
);
INSERT INTO test_a (a_id, group_val) SELECT 'a_id'||s, 'group'||s FROM generate_series(1, 100) s;
INSERT INTO test_a (a_id, group_val) VALUES ('a_id101', 'group1');
CREATE TABLE test_b (
b_id VARCHAR(255) PRIMARY KEY,
a_id VARCHAR(255) NOT NULL,
CONSTRAINT test_bk_fk01 FOREIGN KEY (a_id) REFERENCES test_a (a_id)
);
INSERT INTO test_b (b_id, a_id) SELECT 'b_id'||s, 'a_id1' FROM generate_series(1, 1500) s;
INSERT INTO test_b (b_id, a_id) SELECT 'b_id'||s, 'a_id2' FROM generate_series(1501, 2000) s;
ANALYZE test_a;
ANALYZE test_b;
EXPLAIN ANALYZE SELECT * FROM test_b b JOIN test_a a ON a.a_id = b.a_id WHERE a.group_val = 'group1';
This gives an estimated 40 rows for the join when in fact there are 1500 rows. The crux seems to be that test_b does not have an even distribution for a_id values: it only has records for two of the values in the referenced table. This is how our real dataset is too and isn’t something we can really change.
I’ve read through the row estimation information here: https://www.postgresql.org/docs/11/row-estimation-examples.html and also read through the source code for the eqjoinsel_inner function. I can see how it’s calculating the estimate but I have no clue how it can be made accurate for this (seemingly) simple scenario.
Any hints or tips would be greatly appreciated.