At Wed, 28 Sep 2022 08:50:12 +0000, "Lahnov, Igor" <Igor.Lahnov@xxxxxxxxxx> wrote in
> Hi,
> After failover all stand by nodes could not start streaming wal recovery.
> Streaming recovery start from 1473/A5000000, but standby start at 1473/A5FFEE08, this seems to be the problem.

It's not a problem at all. It is quite normal for a standby to start
streaming from the beginning of a WAL segment.

> What can we do in this case to restore?
> Is it possible to shift wal streaming recovery point on primary?
> Can checkpoint on primary help in this situation?

> 2022-09-26 14:08:23.672 [3747868] LOG: started streaming WAL from primary at 1473/A5000000 on timeline 18
> 2022-09-26 14:08:24.363 [3747796] LOG: invalid record length at 1473/A5FFEE08: wanted 24, got 0
> 2022-09-26 14:08:24.366 [3747868] FATAL: terminating walreceiver process due to administrator command

This seems to mean someone emptied primary_conninfo.

> 2022-09-26 14:08:24.366 [3747796] LOG: invalid record length at 1473/A5FFEE08: wanted 24, got 0
> 2022-09-26 14:08:24.366 [3747796] LOG: invalid record length at 1473/A5FFEE08: wanted 24, got 0

I don't fully understand the situation. One scenario I can come up
with that leads to this state is that the standby somehow restored an
incomplete WAL segment from the primary. For example, someone copied
the currently active WAL file from pg_wal to the archive on the
primary, or restore_command on the standby fetches WAL files from
pg_wal on the primary instead of from its archive. Neither is a normal
operation.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
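
To illustrate why the two positions in the log differ, here is a minimal sketch of the arithmetic, assuming the default WAL segment size of 16 MiB (the thread does not state wal_segment_size for this cluster). The helper names are made up for this example and are not PostgreSQL functions. Rounding the standby's position 1473/A5FFEE08 down to a segment boundary gives the position the walreceiver reported.

```python
# Assumption: default wal_segment_size of 16 MiB; the cluster in the
# thread may have been initialized with a different value.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_lsn(lsn: str) -> int:
    """Convert an LSN in 'hi/lo' hexadecimal notation to a 64-bit integer."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def format_lsn(value: int) -> str:
    """Convert a 64-bit integer back to 'hi/lo' hexadecimal notation."""
    return f"{value >> 32:X}/{value & 0xFFFFFFFF:X}"


def segment_start(lsn: str, segment_size: int = WAL_SEGMENT_SIZE) -> str:
    """Round an LSN down to the first byte of the WAL segment containing it."""
    value = parse_lsn(lsn)
    return format_lsn(value - value % segment_size)


# The standby's replay position from the log in this thread.
print(segment_start("1473/A5FFEE08"))  # prints 1473/A5000000
```

On a live server, the built-in functions pg_walfile_name() and pg_walfile_name_offset() show which segment file a given LSN falls in.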