> Are you using streaming replication, or WAL log shipping?

We are using streaming replication.

On Tue, Mar 11, 2025 at 3:41 PM harinath kanchu <kanchuharinath@xxxxxxxxx> wrote:
>
> Hello Mohit,
>
> > Do you have any status update on what happened after you made that change?
>
> We currently make connections from the standbys to the primary to
> fetch the last archived WAL file info, and it is working as
> expected; we have not observed any gaps in the WAL files in the
> archive storage since this change.
> As the primary is the only node that uploads the WAL files, we stay
> within the API limits of the archive storage.
>
> But I feel that making a connection for every segment is a big
> bottleneck, and sometimes you have to use a superuser to get one of
> the last few reserved connection slots if clients are already
> bombarding the database with a lot of connections.
>
> It would be great if this info were delivered to the standbys
> continuously, without extra connections, and one way I thought of is
> to use the WAL sender keepalive messages. I have yet to test this in
> production, but I want to check the community's feedback before
> doing so.
>
> Thanks,
>
> Best,
> Harinath
>
>
> On Tue, Mar 11, 2025 at 10:15 AM Mohit Mishra
> <mohitmishra786687@xxxxxxxxx> wrote:
> >
> > Hello Harinath,
> >
> > Do you have any status update on what happened after you made that change?
> >
> > It will help us understand your context better.
> >
> > Br,
> > Mohit
> >
> > On Tue, 11 Mar 2025, 07:22 harinath kanchu, <kanchuharinath@xxxxxxxxx> wrote:
> >>
> >> Hello Postgres Admins,
> >>
> >> We have a PostgreSQL setup with a primary and 2 standby nodes, with
> >> synchronous_commit on and synchronous_standby_names = 'ANY 1
> >> (standby1, standby2)'.
> >>
> >> Assuming we use 'archive_mode = on' on all replicas, in the event of
> >> a primary failure and subsequent failover to a standby node, there
> >> is a risk that the primary failed to upload some WALs to archive
> >> storage before the failover.
> >>
> >> On the other hand, setting 'archive_mode = always' ensures WAL
> >> retention on standbys but introduces additional inefficiencies: it
> >> would likely require trying to upload each WAL segment from each
> >> standby, or at the very least making additional calls to check for a
> >> WAL's presence in the archive. This can be costly and may not scale
> >> well.
> >>
> >> Is there a recommended and optimal way to ensure all WAL files are
> >> always uploaded, even across failovers?
> >>
> >> Here is how we ended up addressing this: we set 'archive_mode =
> >> always' on all replicas, but in our archive_command on standby
> >> nodes, we connect to the primary server and query pg_stat_archiver
> >> to determine the last successfully archived WAL file. Based on this
> >> information, the command either returns exit code 0 (if the primary
> >> confirms the WAL was archived) or a non-zero value (if the primary
> >> has not archived it, or fails to respond); a simplified sketch of
> >> this helper is included below.
> >>
> >> This enables us to retain on the standbys all WALs that have not yet
> >> been archived by the primary, and prevents redundant uploads, but it
> >> involves additional communication between the standbys and the
> >> primary.
> >>
> >> My idea for a more robust solution is to include the last
> >> successfully archived WAL file information in the keepalive messages
> >> sent from the primary to standby nodes. This would allow standby
> >> servers to use this data directly, eliminating the need for extra
> >> queries to the primary.
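> >>
> >> For reference, here is a minimal sketch of the kind of
> >> archive_command helper we run on the standbys. This is a simplified
> >> illustration and not our production script: the connection string,
> >> the script name, and the single-timeline name comparison are
> >> assumptions, and it ignores timeline history files.
> >>
> >> #!/usr/bin/env python3
> >> # Sketch of a standby-side archive_command helper: ask the primary
> >> # which WAL file it archived last, and only report success for
> >> # segments the primary has already covered.
> >> #
> >> # Example postgresql.conf entry on the standby (illustrative path):
> >> #   archive_command = '/usr/local/bin/standby_archive_check.py %f'
> >> #
> >> # Exit code 0 lets PostgreSQL recycle the segment; any non-zero
> >> # value makes it retain the segment and retry later.
> >>
> >> import sys
> >> import psycopg2
> >>
> >> PRIMARY_DSN = "host=primary.example.com dbname=postgres user=archiver"
> >>
> >> def main() -> int:
> >>     wal_file = sys.argv[1]  # %f: name of the segment to archive
> >>     try:
> >>         conn = psycopg2.connect(PRIMARY_DSN, connect_timeout=5)
> >>     except psycopg2.Error:
> >>         # Primary unreachable: keep the segment and retry later.
> >>         return 1
> >>     try:
> >>         with conn.cursor() as cur:
> >>             cur.execute("SELECT last_archived_wal FROM pg_stat_archiver")
> >>             (last_archived,) = cur.fetchone()
> >>     finally:
> >>         conn.close()
> >>     if last_archived is None:
> >>         # Primary has not archived anything yet.
> >>         return 1
> >>     # On a single timeline, WAL segment names sort lexicographically
> >>     # in WAL order, so the primary has already archived this segment
> >>     # iff its name is <= the last archived name.
> >>     return 0 if wal_file <= last_archived else 1
> >>
> >> if __name__ == "__main__":
> >>     sys.exit(main())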
> >>
> >> If the community finds this approach useful, I would be interested
> >> in contributing a patch to implement this improvement. Please let us
> >> know your thoughts!
> >>
> >> Thanks,
> >>
> >> Best,
> >> Harinath