Hi,
I am running the same application (identical codebase) as two separate
instances to index (save) different sets of data. Both run PostgreSQL 13.
The queries are the same but the content of the actual databases is
different. One database is around 1 TB and the other around 300 GB.
There is a problem with a paginated SELECT query with a join and an
ORDER BY. Depending on which column you order by (the primary key or
the foreign key), the query is either instant or takes minutes.
So on one database, ordering by the primary key is instant while on the
other it is slow, and vice versa. Switching to the other column in the
slow case fixes the issue.
All relevant columns are indexed.
Simplified:
Slow: SELECT * FROM table1 AS t1 LEFT JOIN table2 AS t2
      ON t2.t1_id = t1.id ORDER BY t1.id ASC LIMIT 10 OFFSET 0
Fast: SELECT * FROM table1 AS t1 LEFT JOIN table2 AS t2
      ON t2.t1_id = t1.id ORDER BY t2.t1_id ASC LIMIT 10 OFFSET 0
(And the opposite on the other instance: the first query is fast and
the second one is slow.)
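For context, when I do post the plans, they will come from something
along these lines (using the simplified table and column names above,
not the real schema):

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT * FROM table1 AS t1 LEFT JOIN table2 AS t2
        ON t2.t1_id = t1.id
    ORDER BY t1.id ASC LIMIT 10 OFFSET 0;

    -- and the same statement with ORDER BY t2.t1_id for comparison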
I have run all the statistics recalculations, but that doesn't help. As
far as I can tell from the docs, there is no real way to influence the
plan other than reworking the query (I've read about optimization
fences?), which can't really be done here because the behaviour is
unpredictable and depends on the actual data and its volume.
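If I understood the fencing idea correctly, it would look roughly like
the sketch below, using the simplified names from above and an
arbitrary CTE name (first_page). Note this is only a sketch of the
technique and is not equivalent to the original query when one table1
row matches several table2 rows:

    WITH first_page AS MATERIALIZED (
        -- materialized CTE acts as an optimization fence in PG 12+
        SELECT * FROM table1 ORDER BY id ASC LIMIT 10 OFFSET 0
    )
    SELECT * FROM first_page AS t1
    LEFT JOIN table2 AS t2 ON t2.t1_id = t1.id
    ORDER BY t1.id ASC;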
I haven't tried reindexing.
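For concreteness, the statistics recalculation I ran and the reindexing
I would try are something along these lines (again with the simplified
table names):

    ANALYZE table1;
    ANALYZE table2;

    REINDEX TABLE CONCURRENTLY table1;  -- not tried yet
    REINDEX TABLE CONCURRENTLY table2;  -- not tried yet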
Before providing and diving into the specific query plans:
- does the planner take previous runs of the same query and their
execution times into account? If not, why not?
- assuming the query to be immutable, would it be possible for the
planner to microbenchmark a few different plans instead of trying to
estimate the cost?
As in, actually executing the query with different plans and caching the
best one.
PS: good job to the FOSDEM Devroom organisers, one of the best rooms,
it was a blast.
Best regards, Cen