Logical replication stopped suddenly claiming wal_status lost when max_slot_wal_keep_size was unlimited


 



After running continuously for perhaps a year or more, logical replication for my project stopped on our test DB this morning. The server claims the WAL was lost because of a size limit, even though no limit is configured.

The system is running CentOS 7, and I was planning to move to RHEL 8 and PostgreSQL 14.12 today, but so much for that.


Is this a bug that was fixed in a later release of 14?

Is there some other setting that must be configured to make sure the WAL is retained?


Here are the details:

Version:

PostgreSQL 14.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit


Log entries (the entries after the last one shown just kept reporting that the slot was invalid):

2024-08-23 03:07:45.926 UTC [1121] LOG:  starting logical decoding for slot "track_subscription"

2024-08-23 03:07:45.926 UTC [1121] DETAIL:  Streaming transactions committing after AB17/4A0C9F40, reading WAL from AB17/46D98068.

2024-08-23 03:07:45.926 UTC [1121] STATEMENT:  START_REPLICATION SLOT "track_subscription" LOGICAL AB17/554088B0 (proto_version '2', publication_names '"track_ingestion"')

2024-08-23 03:07:45.926 UTC [1121] LOG:  logical decoding found consistent point at AB17/46D98068

2024-08-23 03:07:45.926 UTC [1121] DETAIL:  There are no running transactions.

2024-08-23 03:07:45.926 UTC [1121] STATEMENT:  START_REPLICATION SLOT "track_subscription" LOGICAL AB17/554088B0 (proto_version '2', publication_names '"track_ingestion"')

2024-08-23 03:08:17.161 UTC [48799] LOG:  terminating process 1121 to release replication slot "track_subscription"

2024-08-23 03:08:17.161 UTC [1121] FATAL:  terminating connection due to administrator command

2024-08-23 03:08:17.161 UTC [1121] CONTEXT:  slot "track_subscription", output plugin "pgoutput", in the change callback, associated LSN AB17/663138F0

2024-08-23 03:08:17.161 UTC [1121] STATEMENT:  START_REPLICATION SLOT "track_subscription" LOGICAL AB17/554088B0 (proto_version '2', publication_names '"track_ingestion"')

2024-08-23 03:08:17.190 UTC [1121] LOG:  disconnection: session time: 0:00:33.502 user=sysrep database=trackdb host=postgresqldb03.s2a.nrl.navy.mil.31.250.132.in-addr.arpa port=36840

2024-08-23 03:08:17.195 UTC [48799] LOG:  invalidating slot "track_subscription" because its restart_lsn AB17/4D0E3320 exceeds max_slot_wal_keep_size


trackdb=# select * from pg_replication_slots;
     slot_name      |  plugin  | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status | safe_wal_size | two_phase
--------------------+----------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------+-----------
 track_subscription | pgoutput | logical   |  16386 | trackdb  | f         | f      |            |      |    130568429 |             | AB17/554088B0       | lost       |               | f
(1 row)

 

show max_slot_wal_keep_size;
 max_slot_wal_keep_size
------------------------
 -1
(1 row)
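
To put the LSNs from the log in perspective, here is a rough sketch that converts them to byte positions and prints the distance between the slot's restart_lsn at invalidation (AB17/4D0E3320) and the LSN being decoded when the walsender was terminated (AB17/663138F0). The helper names `parse_lsn` and `lsn_distance` are made up for illustration. This only measures the gap between two points that appear in the log; the server's current WAL write position is not shown above, so the real amount of WAL held back by the slot may have been larger. On the server itself, `pg_wal_lsn_diff()` does the same subtraction.

```python
def parse_lsn(text: str) -> int:
    """Convert an LSN such as 'AB17/4D0E3320' into an absolute byte position.

    An LSN is printed as two hexadecimal numbers: the high 32 bits and the
    low 32 bits of a 64-bit position in the WAL stream.
    """
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lsn_distance(newer: str, older: str) -> int:
    """Number of WAL bytes between two LSNs (hypothetical helper)."""
    return parse_lsn(newer) - parse_lsn(older)


# Values copied from the log entries and the pg_replication_slots output.
restart_lsn = "AB17/4D0E3320"          # from the "invalidating slot" line
decoding_lsn = "AB17/663138F0"         # "associated LSN" in the CONTEXT line
confirmed_flush_lsn = "AB17/554088B0"  # from pg_replication_slots

gap = lsn_distance(decoding_lsn, restart_lsn)
print(f"decoding position is {gap} bytes ({gap / 2**20:.1f} MiB) past restart_lsn")
# prints: decoding position is 421725648 bytes (402.2 MiB) past restart_lsn

flushed = lsn_distance(confirmed_flush_lsn, restart_lsn)
print(f"confirmed_flush_lsn is {flushed / 2**20:.1f} MiB past restart_lsn")
```

So between those two points there were only about 400 MiB of WAL, which does not look like the kind of backlog I would expect to trigger a size-based invalidation.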


Thanks,


Dennis

