On Sun, May 26, 2024 at 11:18 PM Torsten Förtsch <tfoertsch123@xxxxxxxxx> wrote:
Each query on the replica has a backend_xmin. You can see that in pg_stat_activity. From that backend's perspective, tuples marked as deleted by any transaction greater or equal to backend_xmin are still needed. This does not depend on the table.Now, vacuum writes to the WAL up to which point it has vacuumed on the master. In pg_waldump this looks like so:PRUNE snapshotConflictHorizon: 774, nredirected: 0, ndead: 5, nunused: 0, redirected: [], dead: [2, 4, 6, 8, 10], unused: [], blkref #0: rel 1663/5/16430 blk 0That snapshotConflictHorizon is also a transaction id. If the backend_xmin of all backends running transactions in the same database (the 5 in 1663/5/16430) -as the vacuum WAL record is greater than vacuum's snapshotConflictHorizon, then there is no conflict. If any of the backend_xmin's is less, then there is a conflict.This type of conflict is determined by just 2 numbers, the conflict horizon sent by the master in the WAL, and the minimum of all backend_xmins. For your case this means a long running transaction querying table t1 might have a backend_xmin of 223. On the master update and delete operations happen on table T2. Since all the transactions on the master are fast, when vacuum hits T2, the minimum of all backend_xmins on the master might already be 425. So, garbage left over by all transactions up to 424 can be cleaned up. Now that cleanup record reaches the replica. It compares 223>425 which is false. So, there is a conflict. Now the replica can wait until its own horizon reaches 425 or it can kill all backends with a lower backend_xmin.As I understand, hot_standby_feedback does not work for you. Not sure if you can run the query on the master? That would resolve the issues but might generate the same bloat on the master as hot_standby_feedback. Another option I can see is to run long running queries on a dedicated replica with max_standby_streaming_delay set to infinity or something large enough. If you go that way, you could also fetch the WAL from your WAL archive instead of replicating from the master. That way the replica has absolutely no chance to affect the master.
Thank you so much.
And keep the hot_standby_feedback as ON for the first standby which is used as HA/high availability. And keep the hot_standby_feedback as OFF for the second standby which is utilized for long running SELECT queries, so that primary won't be waiting for the response/feedback from this standby to vacuum its old transactions and that will keep the transaction id wrap around issue from not happening because of the Read/Select queries on any of the standby.