Any advice on a different mailing list that would be better suited to a question like this?
Regards,
Koen De Groote
On Fri, Jan 31, 2025 at 8:38 PM Koen De Groote <kdg.dev@xxxxxxxxx> wrote:
No, it's meant to be an off-site restore, to check daily that the restore actually works.
Regards,
Koen De Groote
On Fri, Jan 31, 2025 at 2:30 PM Laurenz Albe <laurenz.albe@xxxxxxxxxxx> wrote:
On Fri, 2025-01-31 at 10:47 +0100, Koen De Groote wrote:
> I'm running postgres 16.6
>
> My backup strategy is: basebackup and WAL archive. These get uploaded to the cloud.
>
> The restore is on an isolated machine and is performed daily. It downloads the
> basebackup, unpacks it, sets a recovery.signal, and a script is provided as
> restore_command, to download the WAL archives %f and unpack them into %p
>
> In the script, the final unpacking is simply "gzip -dc %f > %p". The gz files
> are first checked with "gzip -t".
>
> If a WAL archive is requested that doesn't exist yet, the script naturally cannot
> find it, and exits with status code 1. This is the end of the recovery.
>
> There are a few tables that are known to receive new entries multiple times
> per day. However, the state of the recovery showed the latest item to be 2
> days in the past. Checking the live DB, there is the expected number of items
> since that ID.
>
> I checked the logs, the last WAL archive that got downloaded is indeed the
> last one that was available. The one that failed to download on the restore
> machine, was uploaded to the cloud 8 minutes later, according to the upload
> logs on the live DB.
>
> The postgres logs themselves seem perfectly normal. It logs all these WAL
> recoveries, switches the timeline, and becomes available.
>
> What could be going wrong? My main issue is that I don't know where to start
> looking, since nothing in the logs seems abnormal.
I don't know, that all sounds like it is working as it should.
If the last WAL archive that got downloaded by the "restore_command" is indeed
the last one that was available, recovery did just what it is supposed to.
If new WAL segments get archived later, that's too late.
Perhaps you are looking for replication, not for restoring a backup, which is
necessarily not totally up to date.
Yours,
Laurenz Albe
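
For reference, the restore_command script described in the thread can be sketched roughly as below. This is only an illustration under stated assumptions, not the poster's actual script: the function name restore_wal is hypothetical, a local directory stands in for the cloud download step, and the write-to-a-temporary-file-then-rename step is an addition that the original "gzip -dc %f > %p" does not have. It shows the behavior discussed above: a segment that is not in the archive yet makes the command return non-zero, which recovery treats as the end of the archived WAL.

```shell
#!/bin/sh
# Hypothetical helper for a restore_command wrapper script.
# Assumption: the segments have already been fetched into a local directory
# as <name>.gz; in the thread this step is a download from cloud storage.
# A wrapper script would end with something like:
#   restore_wal "$ARCHIVE_DIR" "$1" "$2"
# and be configured as: restore_command = '/path/to/wrapper.sh %f %p'

restore_wal() {
    archive_dir=$1   # directory holding the fetched .gz segments (assumption)
    wal_name=$2      # file name requested by PostgreSQL (%f)
    target=$3        # path PostgreSQL wants the file written to (%p)
    src="$archive_dir/$wal_name.gz"

    # A segment that is not archived yet: return non-zero.
    # Recovery treats this as the end of the archived WAL.
    [ -f "$src" ] || return 1

    # Verify the compressed file before using it, as in the thread.
    gzip -t "$src" || return 1

    # Decompress to a temporary name first so that a partially written
    # file is never left at the target path (an addition, not in the thread).
    tmp="$target.tmp.$$"
    if gzip -dc "$src" > "$tmp"; then
        mv "$tmp" "$target"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

With a script of this shape, recovery replays every segment that is present at restore time and stops at the first one that is missing, so a segment uploaded a few minutes later is not picked up by that run.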