Hello,
My customer has a really weird planner issue on a standby.
First,
the context. There's a primary, and two standbys, all on a 11.8
release. (I know this isn't the latest release for the v11 branch.) The
cluster is 5.5TB. The PostgreSQL configuration is the same on all
servers. pg_db_role_setting is almost empty, and there's nothing
specific to planning and replication.
Here is the issue. Queries on both standbys take a lot more time than on the primary. We eventually found that the queries take a lot of time to be planned, not to be executed. For example:
On the primary:
Planning Time: 1806.574 ms
Execution Time: 771.888 ms
On any of the standbys:
Planning Time: 41566.439 ms
Execution Time: 1125.352 ms
A simple EXPLAIN already shows this difference in duration.
The
query in itself isn't really that big. 8 tables (without partitions), a
few predicates. Nothing fancy. Nothing huge to plan. Here it is, a bit
anonymized:
select ...
from t1
left outer join t2 on ...
left outer join t3 on ...
left outer join t4 on ...
left outer join t5 on ...
left outer join t6 on ...
left outer join t7 on ...
left outer join t8 on ...
where c1='...' and c2='...'
and c3>='...' and c4<='...' ;
Join conditions are really simple. There's no function called anywhere.
Plans on the three servers are exactly the same. Same nodes, same statistics, same calculations' results.
Moreover,
we've looked into what the planner was doing with strace, and we've
been surprised that it involved massive reading on tables (more than
130,000 calls to read() on 15 files). We found that most of these files
are the files for one of the tables on this query. Also, this massive
reading happens only on standbys, not on the primary.
Our customer took the time to drop and create the standbys yesterday night, but the new standbys still have the issue.
And
finally, we discovered that promoting a standby brings back the old
performance. We have the same performance between the primary and the
promoted standby.
To be honest, I'm lost, I don't know what to do next. But I have some questions :)
*
The planner seems to read tables to generate the plans and/or select
the right plan. Why does it do that? I thought it only reads indexes if
needed, but it is quite clear it reads tables also.
* How can
the replication have an impact on the planner ? (I'm not sure I'm
asking the right question... I mean, why do I see a difference in
behaviour between a standby and an old standby, new autonomous server?)
* Do you have any idea on how to fix this?
Thank you.
Regards.
--
Guillaume.