On 12/27/23 16:31, Kaushik Iska wrote:
> Hi all,
>
> I'm including additional details, as I am able to reproduce this issue a
> little more reliably.
>
> Postgres Version: POSTGRES_14_9.R20230830.01_07
> Vendor: Google Cloud SQL
> Logical Replication Protocol version 1
>

I don't know much about Google Cloud SQL internals. Is it relatively close
to Postgres (like e.g. RDS), or are the internals very different / modified
for cloud environments?

> Here are the logs of attempt succeeding right after it fails:
>
> 2023-12-27 01:12:40.581 UTC [59790]: [6-1] db=postgres,user=postgres
> STATEMENT: START_REPLICATION SLOT peerflow_slot_wal_testing_2 LOGICAL
> 6/5AE67D79 (proto_version '1', publication_names
> 'peerflow_pub_wal_testing_2') <- FAILS
> 2023-12-27 01:12:41.087 UTC [59790]: [7-1] db=postgres,user=postgres
> ERROR: requested WAL segment 000000010000000600000059 has already been
> removed
> 2023-12-27 01:12:44.581 UTC [59794]: [3-1] db=postgres,user=postgres
> STATEMENT: START_REPLICATION SLOT peerflow_slot_wal_testing_2 LOGICAL
> 6/5AE67D79 (proto_version '1', publication_names
> 'peerflow_pub_wal_testing_2') <- SUCCEEDS
> 2023-12-27 01:12:44.582 UTC [59794]: [4-1] db=postgres,user=postgres
> LOG: logical decoding found consistent point at 6/5A31F050
>
> Happy to include any additional details of my setup.
>

I personally don't see how this could fail and then succeed, unless Google
does something smart with the WAL segments under the hood. Surely we try
to open the same WAL segment (given the LSN is the same), so how could it
not exist and then exist?

As Ron already suggested, it might be useful to see information for the
replication slot peerflow_slot_wal_testing_2 (especially the restart_lsn
value). Also, maybe show the contents of pg_wal (especially for the
segment referenced in the error message).

Can you reproduce this outside the Google Cloud environment?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
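
To relate the LSNs in the log to the segment named in the error message, it may help to compute which WAL file each LSN falls into. The following is only a minimal sketch: the helper name wal_segment_name is hypothetical, and it assumes the default 16MB wal_segment_size and timeline 1 (the timeline is inferred from the 00000001 prefix of the file name in the error). On the server itself, pg_walfile_name() performs the same mapping for the current timeline.

```python
def wal_segment_name(lsn: str, timeline: int = 1,
                     segment_size: int = 16 * 1024 * 1024) -> str:
    """Map an LSN such as '6/5AE67D79' to a WAL segment file name.

    Assumptions (not confirmed by the thread): default 16MB segments
    and timeline 1.
    """
    # An LSN is printed as two hex numbers: the high and low 32 bits
    # of a 64-bit byte position in the WAL stream.
    hi_hex, lo_hex = lsn.split("/")
    position = (int(hi_hex, 16) << 32) | int(lo_hex, 16)

    # Segment number = byte position divided by the segment size.
    segno = position // segment_size

    # The file name is three 8-digit hex fields: the timeline, then the
    # segment number split by how many segments fit in 4GB of WAL.
    segments_per_id = 0x100000000 // segment_size
    return (f"{timeline:08X}"
            f"{segno // segments_per_id:08X}"
            f"{segno % segments_per_id:08X}")


# LSN passed to START_REPLICATION in both attempts
print(wal_segment_name("6/5AE67D79"))  # 00000001000000060000005A
# Consistent point reported by the successful attempt
print(wal_segment_name("6/5A31F050"))  # 00000001000000060000005A
```

Under these assumptions both LSNs map to segment 00000001000000060000005A, while the error names 000000010000000600000059, one segment earlier. That would be consistent with decoding starting from a position before the requested LSN, which is one more reason the slot's restart_lsn would be useful to see.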