> When the secondary starts up it should continue replicating from where
> it stopped. However, it can only do this if the necessary information is
> still available. If WAL files have been deleted in the meantime, it
> can't replay them. There should be error messages in your logs on what
> went wrong.
I ran another test with a different wal_sender_timeout setting, because the secondary had been shut down for longer than this parameter's default of 60s.
I was hoping it would help, but the result was the same (the records were not replicated to the secondary after Patroni was started).
Well, I just verified again that the records were replicated to the
secondary after about 15 minutes, so either the timeout setting helped or I
was not patient enough before. Is it normal to wait so long for
replication? (The original transaction on the primary took about 5 minutes and involved about 3000 small records.) I am providing more details below for completeness:
I get the following errors on the primary DB:
2022-04-28 04:36:50.544 EDT [13794] WARNING: archive_mode enabled, yet archive_command is not set
2022-04-28 04:37:34.893 EDT [14755] ERROR: replication slot "xyzd3riardb05" does not exist
2022-04-28 04:37:34.893 EDT [14755] STATEMENT: START_REPLICATION SLOT "xyzd3riardb05" 0/7000000 TIMELINE 18
2022-04-28 04:37:34.915 EDT [14756] ERROR: replication slot "xyzd3riardb05" does not exist
2022-04-28 04:37:34.915 EDT [14756] STATEMENT: START_REPLICATION SLOT "xyzd3riardb05" 0/7000000 TIMELINE 18
2022-04-28 04:37:39.925 EDT [14763] ERROR: replication slot "xyzd3riardb05" does not exist
2022-04-28 04:37:39.925 EDT [14763] STATEMENT: START_REPLICATION SLOT "xyzd3riardb05" 0/7000000 TIMELINE 18
2022-04-28 04:37:44.924 EDT [14768] ERROR: replication slot "xyzd3riardb05" does not exist
2022-04-28 04:37:44.924 EDT [14768] STATEMENT: START_REPLICATION SLOT "xyzd3riardb05" 0/7000000 TIMELINE 18
After some time these errors stop appearing.
When I execute:
su postgres -c "psql -c \"SELECT * FROM pg_replication_slots;\""
I get the following, so the slot seems to exist:
   slot_name   | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status | safe_wal_size | two_phase
---------------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------+-----------
 xyzd3riardb05 |        | physical  |        |          | f         | f      |            |      |              | 0/73289E8   |                     | reserved   |               | f
 pdc2b         |        | physical  |        |          | f         | f      |            |      |              | 0/726D398   |                     | reserved   |               | f
 pdc2b_standby |        | physical  |        |          | f         | f      |            |      |              | 0/726D398   |                     | reserved   |               | f
(3 rows)
And, as I said, I just verified that the records were replicated to the secondary after about 15 minutes.
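
For comparing the two positions that appear above (the 0/7000000 in the START_REPLICATION statements and the restart_lsn 0/73289E8 of the xyzd3riardb05 slot), here is a minimal sketch of the arithmetic. It assumes the standard PostgreSQL LSN text format, where the part before the slash is the high 32 bits and the part after it is the low 32 bits, both in hex. The function names are only illustrative, not part of any library.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert an LSN such as '0/73289E8' to an absolute byte position.

    Assumes the 'HIGH/LOW' hex notation used by PostgreSQL.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lsn_diff_bytes(newer: str, older: str) -> int:
    """Number of WAL bytes between two LSNs (newer minus older)."""
    return lsn_to_int(newer) - lsn_to_int(older)


if __name__ == "__main__":
    requested_start = "0/7000000"   # from the START_REPLICATION statements
    slot_restart = "0/73289E8"      # restart_lsn of slot xyzd3riardb05
    gap = lsn_diff_bytes(slot_restart, requested_start)
    print(f"{gap} bytes ({gap / 1024 / 1024:.2f} MiB)")
```

With the values above the difference is a few MiB of WAL, which only says how far apart the two positions are, not why the replay took as long as it did.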