On Wed, 24 Aug 2022 at 13:00, Laurenz Albe <laurenz.albe@xxxxxxxxxxx> wrote:
On Wed, 2022-08-24 at 14:18 +0900, Kyotaro Horiguchi wrote:
> At Fri, 19 Aug 2022 18:37:53 +0200, Laurenz Albe <laurenz.albe@xxxxxxxxxxx> wrote in
> > On Fri, 2022-08-19 at 16:54 +0200, Giovanni Biscontini wrote:
> > > Hello everyone,
> > > I'm seeing a behaviour and I can't tell whether it is a misconfiguration or intended:
> > > 1) I set up a primary server (a.k.a. db1) with an archive_command to a storage
> > > 2) I set up a replica (a.k.a. db2) that created a slot named slot_2 and that has the restore_command
> > > set to read archived WAL from the storage.
> > > If I shut down replica db2 during a pgbench run, I see the safe_wal_size queried from pg_replication_slots
> > > on the primary decrease by a certain amount but stay within the max_slot_wal_keep_size window: even
> > > if I restart the replica db2 before the slot's wal_status changes to unreserved or lost, I see that the
> > > replica gets the needed WAL from the storage using restore_command but doesn't use the slot on the primary.
> > > Only if I comment out the restore_command in the .conf of the replica does it use the slot.
> > > If this is intended behaviour, I can't understand the need for slots on the primary.
> >
> > This is normal behavior and is no problem.
> >
> > After the standby has caught up using "restore_command", it will connect to
> > the primary as defined in "primary_conninfo" and stream WAL from there.
>
> The reason that db2 ran recovery beyond the slot LSN is that db2's
> restore_command (I guess) points to db1's archive. If db2 had its own
> archive directory or no archive (that is, restore_command is empty),
> archive recovery stops at (approximately) the slot LSN and replication
> will start from there (from the beginning of the segment, to be
> exact).
Is it a problem if archive recovery proceeds past the replication slot's LSN?
I guess I don't see the problem.
Yours,
Laurenz Albe
Hi and thanks all, my thoughts:
a) when I set up a slot I thought it would be useful for two reasons:
a.1) it gives a "per replica" reference to the WAL that must be kept,
a.2) after the replica (db2) disconnects and then reconnects, I expected it would be quicker to fetch the missing WAL referenced by the slot from the primary's pg_wal than to restore it from the archive, especially when the archive is in an S3 bucket (so yes, db2's restore_command points to the same archive that db1 writes to)
b) In my mind the archive, and consequently the restore_command, is the safety net for when the replica falls behind wal_keep_size or (in this case) behind max_slot_wal_keep_size
c) I understand that the idea behind giving precedence to restore_command may be: "a restore_command is configured, so don't even try the slot, because it may already be lost; go straight to the safe path, which is what the archive is for".
But... in that case, if I configure both a slot and a restore_command, the slot, and with it the risk of filling the disk, buys nothing: the replica always uses the restore_command.
So, if I may put it this way, the problem is: I configure a slot that in every case costs more setup time, more disk usage and more configuration, but is useless...
Thanks in advance and best regards, Giovanni
P.S. I forgot to mention earlier: the PostgreSQL version is 14.5
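
To make the ordering discussed above concrete, here is a rough sketch in Python of how I understand the replies: the standby first restores whatever restore_command can deliver from the archive, and only then streams the remainder from the primary. This is a toy model, not PostgreSQL code: the function name wal_sources, the integer segment numbers and the sets standing in for the archive and the primary's pg_wal are all my own assumptions, and the real server also checks its local pg_wal and retries in a loop.

```python
from typing import List, Set, Tuple


def wal_sources(next_segment: int, last_segment: int,
                archived: Set[int], primary_pg_wal: Set[int]) -> List[Tuple[int, str]]:
    """Toy model: return (segment, source) pairs in replay order."""
    plan: List[Tuple[int, str]] = []
    seg = next_segment

    # Step 1: keep restoring from the archive while restore_command succeeds.
    while seg <= last_segment and seg in archived:
        plan.append((seg, "archive"))
        seg += 1

    # Step 2: only when the archive has nothing more, stream the rest from
    # the primary; the slot is what keeps these segments in pg_wal there.
    while seg <= last_segment:
        if seg not in primary_pg_wal:
            raise LookupError(f"segment {seg} is gone from the primary")
        plan.append((seg, "stream"))
        seg += 1
    return plan


# Hypothetical numbers: db2 stopped while needing segment 10, db1 is now at 15.
retained = set(range(10, 16))   # primary's pg_wal, held back by slot_2
archive = set(range(1, 15))     # segment 15 is still being written, not archived

with_restore = wal_sources(10, 15, archive, retained)
without_restore = wal_sources(10, 15, set(), retained)
print(with_restore)
print(without_restore)
```

With a shared archive, every completed segment comes from the archive and only the segment still being written is streamed, which matches what I observe. With the restore_command commented out, everything comes from the primary, and in this model that is the only case where the segments retained by the slot are actually read.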