On Thu, Jul 25, 2024 at 7:57 PM Fernando Hevia <fhevia@xxxxxxxxx> wrote:
Hi Wasim,

I think you might have misinterpreted the explanation given to you. The cancellation of the query on the standby server isn't related to the load on the primary server. It happens that when you run queries on a hot standby, the replication is temporarily paused in order to not modify data the running queries on the standby server need. Once the queries end, replication resumes.

The problem of this behaviour is that the standby server starts to fall behind in relation to the master, a scenario which presents a risky condition: if the master happens to fail while the replica is delayed you end up with data loss.

To avoid having a standby server lagging too far behind, Postgres will cancel long running queries on the replica. The parameter max_standby_streaming_delay defines the maximum replication delay the standby will tolerate. Default is 30 seconds. Increase the value to allow for longer running queries on the standby server, bearing in mind that you could end up with data loss if the master fails at the wrong moment.

A working alternative is to have one standby server exclusively for replication purposes and another standby for reporting/read-only queries where you can increase the max_standby_streaming_delay to accommodate your long running queries. Of course, this will require additional computing and storage resources.

Cheers,
Fernando.

This is all true, but the hot_standby_feedback option is the way to
avoid having to worry about replication delay altogether. With it
enabled, the standby tells the primary which rows its queries still
need, so the effect on VACUUM is no different from running those same
queries on the primary. The reason I mention it is that people often
think that moving queries to the replica takes away all the effects of
running them on the primary. It takes away the load of the query, but
there are side effects that still have to be managed.

Either of the options mentioned is fine as long as you know its
consequences.
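
For reference, here is a rough sketch of what the two options look like in practice. The '5min' value is only an illustrative assumption, not a recommendation, and you would normally pick one option per standby rather than both. I am also assuming you change the settings with ALTER SYSTEM on the standby; editing postgresql.conf there and reloading should have the same effect.

```sql
-- Option 1, run on the reporting standby: tolerate more replay delay
-- before conflicting queries are cancelled.
-- '5min' is an arbitrary example value; size it to your longest query.
ALTER SYSTEM SET max_standby_streaming_delay = '5min';

-- Option 2, run on the standby: report the oldest snapshot in use back
-- to the primary so VACUUM there keeps the rows standby queries need.
ALTER SYSTEM SET hot_standby_feedback = on;

-- Both parameters are picked up on a configuration reload.
SELECT pg_reload_conf();

-- On the standby: approximate replay delay. This keeps growing while
-- the primary is idle, so read it together with the write activity.
SELECT now() - pg_last_xact_replay_timestamp() AS replay_delay;

-- On the standby: how many queries were cancelled by recovery
-- conflicts, per database and per cause.
SELECT datname, confl_snapshot, confl_lock, confl_bufferpin
FROM pg_stat_database_conflicts;

-- On the primary: the xmin each standby is holding back when
-- hot_standby_feedback is on. A value that stays old for a long time
-- means VACUUM cannot clean up and bloat can build up.
SELECT application_name, backend_xmin
FROM pg_stat_replication;
```

The last two queries are the ones to keep an eye on: the conflict counters show what option 1 is costing you in cancelled queries, and backend_xmin shows what option 2 is costing you in delayed cleanup on the primary.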