On 2021-Aug-16, Scott Ribe wrote:

> If I use max_slot_wal_keep_size to limit disk impact of a down
> replica, and subsequently a down replica causes PG to hit this limit,
> is there a particular message that will be logged when the limit is
> crossed and PG starts to purge WAL?
>
> Context is: trying to debug a failure to bring up a replica, where the
> failure happened in the middle of a moderately complex chain of events
> that likely started with a bad disk. (Patroni is involved, FWIW)

Yes, you should see

  invalidating slot "..." because its restart_lsn ... exceeds max_slot_wal_keep_size

However, there was a bug fixed recently in that area, whereby the slot
would be invalidated but the space would not be freed; the fix was on
July 16th and it was released together with last week's minors:

Author: Alvaro Herrera <alvherre@xxxxxxxxxxxxxx>
Branch: master [ead9e51e8] 2021-07-16 12:07:30 -0400
Branch: REL_14_STABLE [e5bcbb107] 2021-07-16 12:07:30 -0400
Branch: REL_13_STABLE Release: REL_13_4 [866237a6f] 2021-07-16 12:07:30 -0400

    Advance old-segment horizon properly after slot invalidation

    When some slots are invalidated due to the max_slot_wal_keep_size limit,
    the old segment horizon should move forward to stay within the limit.
    However, in commit c6550776394e we forgot to call KeepLogSeg again to
    recompute the horizon after invalidating replication slots.  In cases
    where other slots remained, the limits would be recomputed eventually
    for other reasons, but if all slots were invalidated, the limits would
    not move at all afterwards.  Repair.

    Backpatch to 13 where the feature was introduced.

    Author: Kyotaro Horiguchi <horikyota.ntt@xxxxxxxxx>
    Reported-by: Marcin Krupowicz <mk@xxxxxxx>
    Discussion: https://postgr.es/m/17103-004130e8f27782c9@xxxxxxxxxxxxxx

-- 
Álvaro Herrera              39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
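
For anyone searching their server logs for the message quoted above, here is a minimal sketch in Python. It assumes the log text contains the message in the form shown, with a quoted slot name and a whitespace-free restart_lsn value filling the two elided parts; the exact wording may differ between server versions. The function name find_invalidated_slots is made up for illustration and is not part of any PostgreSQL tooling.

```python
import re

# Pattern built from the message quoted in this thread. The slot name and
# the restart_lsn value are elided there, so they are matched loosely:
# anything between the double quotes, and any run of non-whitespace.
INVALIDATION_RE = re.compile(
    r'invalidating slot "(?P<slot>[^"]+)" because its restart_lsn '
    r'(?P<lsn>\S+) exceeds max_slot_wal_keep_size'
)


def find_invalidated_slots(lines):
    """Return a list of (slot_name, restart_lsn) for each matching line.

    `lines` can be any iterable of strings, such as an open log file.
    """
    hits = []
    for line in lines:
        match = INVALIDATION_RE.search(line)
        if match:
            hits.append((match.group("slot"), match.group("lsn")))
    return hits
```

To use it, pass an open log file, e.g. `find_invalidated_slots(open(path))`, where `path` is wherever your installation writes its server log. An empty result only means that this wording was not found in that file; it does not by itself show that no slot was invalidated.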