Hi, folks; I’m in the process of testing PG v12.7 hot streaming standby’s in async mode, and specifically in the area of reliable switchover and switchback operations. I have created test primary’s and standby’s successfully with pg_ctl initdb and pg_basebackup respectively. I can switch over/switch roles (standby -> primary, primary -> standby) successfully one time. When I switch roles a second time, the standby becomes primary just fine but the primary to standby fails to open properly. Logfile messages for both primary and standby are attached as well as pg_wal and archive directory lists. I shut down the primary first and then promote the standby. The only action I take on the standby side is to create the standby.signal file in $PGDATA. (Well almost. Initially, I do have to have an empty 00000000.history file in the archive locations or the standby’s fail to start.) My archive directories are separate from each other and the archive command is the given test … cp -p example in the docs. archive_mode is ‘on’ on both sides. Are there other undocumented manual steps I should be completing before starting the new standby? Many thanks! |
2022-05-18 09:52:52.820 EDT [55785] LOG: starting PostgreSQL 12.7 on x86_64-pc-cygwin, compiled by gcc (GCC) 10.2.0, 64-bit 2022-05-18 09:52:52.833 EDT [55785] LOG: listening on IPv6 address "::1", port 5432 2022-05-18 09:52:52.833 EDT [55785] LOG: listening on IPv4 address "127.0.0.1", port 5432 2022-05-18 09:52:52.840 EDT [55785] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432" 2022-05-18 09:52:53.088 EDT [55786] LOG: database system was shut down at 2022-05-18 09:52:47 EDT 2022-05-18 09:52:54.543 EDT [55785] LOG: database system is ready to accept connections 2022-05-18 09:56:52.618 EDT [55785] LOG: received fast shutdown request 2022-05-18 09:56:52.623 EDT [55785] LOG: aborting any active transactions 2022-05-18 09:56:52.639 EDT [55785] LOG: background worker "logical replication launcher" (PID 55793) exited with exit code 1 2022-05-18 09:56:52.656 EDT [55787] LOG: shutting down 2022-05-18 09:56:52.841 EDT [55785] LOG: database system is shut down 2022-05-18 09:57:35.741 EDT [55861] LOG: starting PostgreSQL 12.7 on x86_64-pc-cygwin, compiled by gcc (GCC) 10.2.0, 64-bit 2022-05-18 09:57:35.756 EDT [55861] LOG: listening on IPv6 address "::1", port 5432 2022-05-18 09:57:35.756 EDT [55861] LOG: listening on IPv4 address "127.0.0.1", port 5432 2022-05-18 09:57:35.764 EDT [55861] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432" 2022-05-18 09:57:35.980 EDT [55862] LOG: database system was shut down at 2022-05-18 09:56:52 EDT cp: cannot stat '/home/itman/pgdata/archive1/00000002.history': No such file or directory 2022-05-18 09:57:36.093 EDT [55862] LOG: entering standby mode cp: cannot stat '/home/itman/pgdata/archive1/000000010000000000000004': No such file or directory 2022-05-18 09:57:36.232 EDT [55862] LOG: consistent recovery state reached at 0/40000A0 2022-05-18 09:57:36.232 EDT [55862] LOG: invalid record length at 0/40000A0: wanted 24, got 0 2022-05-18 09:57:36.859 EDT [55861] LOG: database system is ready to accept read only connections 2022-05-18 09:57:37.298 EDT [55868] LOG: started streaming WAL from primary at 0/4000000 on timeline 1 2022-05-18 09:57:37.301 EDT [55862] LOG: redo starts at 0/40000A0 2022-05-18 09:59:14.673 EDT [55868] LOG: replication terminated by primary server 2022-05-18 09:59:14.673 EDT [55868] DETAIL: End of WAL reached on timeline 1 at 0/50000A0. 2022-05-18 09:59:14.673 EDT [55868] FATAL: could not send end-of-streaming message to primary: server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. no COPY in progress cp: cannot stat '/home/itman/pgdata/archive1/00000002.history': No such file or directory cp: cannot stat '/home/itman/pgdata/archive1/000000010000000000000005': No such file or directory 2022-05-18 09:59:14.902 EDT [55862] LOG: invalid record length at 0/50000A0: wanted 24, got 0 2022-05-18 09:59:17.107 EDT [55881] FATAL: could not connect to the primary server: connection to server at "127.0.0.1", port 5433 failed: Connection refused Is the server running on that host and accepting TCP/IP connections? cp: cannot stat '/home/itman/pgdata/archive1/00000002.history': No such file or directory cp: cannot stat '/home/itman/pgdata/archive1/000000010000000000000005': No such file or directory 2022-05-18 09:59:17.396 EDT [55862] LOG: received promote request 2022-05-18 09:59:17.396 EDT [55862] LOG: redo done at 0/5000028 2022-05-18 09:59:17.396 EDT [55862] LOG: last completed transaction was at log time 2022-05-18 09:58:23.71503-04 cp: cannot stat '/home/itman/pgdata/archive1/000000010000000000000005': No such file or directory cp: cannot stat '/home/itman/pgdata/archive1/00000002.history': No such file or directory 2022-05-18 09:59:17.659 EDT [55862] LOG: selected new timeline ID: 2 2022-05-18 09:59:17.701 EDT [55862] LOG: archive recovery complete cp: cannot stat '/home/itman/pgdata/archive1/00000001.history': No such file or directory 2022-05-18 09:59:18.765 EDT [55861] LOG: database system is ready to accept connections
2022-05-18 09:53:35.044 EDT [55811] LOG: starting PostgreSQL 12.7 on x86_64-pc-cygwin, compiled by gcc (GCC) 10.2.0, 64-bit 2022-05-18 09:53:35.058 EDT [55811] LOG: listening on IPv6 address "::1", port 5433 2022-05-18 09:53:35.058 EDT [55811] LOG: listening on IPv4 address "127.0.0.1", port 5433 2022-05-18 09:53:35.066 EDT [55811] LOG: listening on Unix socket "/tmp/.s.PGSQL.5433" 2022-05-18 09:53:35.264 EDT [55812] LOG: database system was interrupted; last known up at 2022-05-18 09:53:18 EDT cp: cannot stat '/home/itman/pgdata/archive2/00000000.history': No such file or directory 2022-05-18 09:53:36.007 EDT [55812] FATAL: recovery target timeline 0 does not exist 2022-05-18 09:53:36.023 EDT [55811] LOG: startup process (PID 55812) exited with exit code 1 2022-05-18 09:53:36.023 EDT [55811] LOG: aborting startup due to startup process failure 2022-05-18 09:53:36.032 EDT [55811] LOG: database system is shut down 2022-05-18 09:54:49.267 EDT [55821] LOG: starting PostgreSQL 12.7 on x86_64-pc-cygwin, compiled by gcc (GCC) 10.2.0, 64-bit 2022-05-18 09:54:49.282 EDT [55821] LOG: listening on IPv6 address "::1", port 5433 2022-05-18 09:54:49.282 EDT [55821] LOG: listening on IPv4 address "127.0.0.1", port 5433 2022-05-18 09:54:49.289 EDT [55821] LOG: listening on Unix socket "/tmp/.s.PGSQL.5433" 2022-05-18 09:54:49.551 EDT [55822] LOG: database system was interrupted; last known up at 2022-05-18 09:53:18 EDT 2022-05-18 09:54:50.281 EDT [55822] LOG: restored log file "00000000.history" from archive 2022-05-18 09:54:50.281 EDT [55822] LOG: entering standby mode 2022-05-18 09:54:50.408 EDT [55822] LOG: restored log file "00000000.history" from archive cp: cannot stat '/home/itman/pgdata/archive2/000000000000000000000002': No such file or directory 2022-05-18 09:54:50.940 EDT [55826] LOG: started streaming WAL from primary at 0/2000000 on timeline 1 2022-05-18 09:54:50.982 EDT [55822] LOG: redo starts at 0/2000028 2022-05-18 09:54:50.983 EDT [55822] LOG: consistent recovery state reached at 0/2000100 2022-05-18 09:54:51.697 EDT [55821] LOG: database system is ready to accept read only connections 2022-05-18 09:56:52.819 EDT [55826] LOG: replication terminated by primary server 2022-05-18 09:56:52.819 EDT [55826] DETAIL: End of WAL reached on timeline 1 at 0/40000A0. 2022-05-18 09:56:52.819 EDT [55826] FATAL: could not send end-of-streaming message to primary: server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. no COPY in progress cp: cannot stat '/home/itman/pgdata/archive2/000000010000000000000004': No such file or directory 2022-05-18 09:56:52.920 EDT [55822] LOG: invalid record length at 0/40000A0: wanted 24, got 0 2022-05-18 09:56:54.461 EDT [55822] LOG: received promote request 2022-05-18 09:56:54.462 EDT [55846] FATAL: terminating walreceiver process due to administrator command cp: cannot stat '/home/itman/pgdata/archive2/000000010000000000000004': No such file or directory 2022-05-18 09:56:54.602 EDT [55822] LOG: redo done at 0/4000028 2022-05-18 09:56:54.602 EDT [55822] LOG: last completed transaction was at log time 2022-05-18 09:55:26.332136-04 cp: cannot stat '/home/itman/pgdata/archive2/000000010000000000000004': No such file or directory 2022-05-18 09:56:54.745 EDT [55822] LOG: selected new timeline ID: 1 2022-05-18 09:56:54.800 EDT [55822] LOG: archive recovery complete 2022-05-18 09:56:54.933 EDT [55822] LOG: restored log file "00000000.history" from archive 2022-05-18 09:56:55.910 EDT [55821] LOG: database system is ready to accept connections 2022-05-18 09:59:14.514 EDT [55821] LOG: received fast shutdown request 2022-05-18 09:59:14.519 EDT [55821] LOG: aborting any active transactions 2022-05-18 09:59:14.535 EDT [55821] LOG: background worker "logical replication launcher" (PID 55856) exited with exit code 1 2022-05-18 09:59:14.536 EDT [55828] LOG: shutting down 2022-05-18 09:59:14.701 EDT [55821] LOG: database system is shut down 2022-05-18 10:00:31.541 EDT [55985] LOG: starting PostgreSQL 12.7 on x86_64-pc-cygwin, compiled by gcc (GCC) 10.2.0, 64-bit 2022-05-18 10:00:31.552 EDT [55985] LOG: listening on IPv6 address "::1", port 5433 2022-05-18 10:00:31.552 EDT [55985] LOG: listening on IPv4 address "127.0.0.1", port 5433 2022-05-18 10:00:31.559 EDT [55985] LOG: listening on Unix socket "/tmp/.s.PGSQL.5433" 2022-05-18 10:00:31.753 EDT [55986] LOG: database system was shut down at 2022-05-18 09:59:14 EDT 2022-05-18 10:00:31.872 EDT [55986] LOG: restored log file "00000000.history" from archive 2022-05-18 10:00:31.872 EDT [55986] LOG: entering standby mode 2022-05-18 10:00:32.001 EDT [55986] LOG: restored log file "00000000.history" from archive cp: cannot stat '/home/itman/pgdata/archive2/000000000000000000000005': No such file or directory 2022-05-18 10:00:32.550 EDT [55990] LOG: fetching timeline history file for timeline 2 from primary server 2022-05-18 10:00:32.568 EDT [55990] LOG: started streaming WAL from primary at 0/5000000 on timeline 1 2022-05-18 10:00:32.568 EDT [55990] LOG: replication terminated by primary server 2022-05-18 10:00:32.568 EDT [55990] DETAIL: End of WAL reached on timeline 1 at 0/50000A0. 2022-05-18 10:00:32.580 EDT [55986] LOG: consistent recovery state reached at 0/50000A0 cp: cannot stat '/home/itman/pgdata/archive2/000000010000000000000005': No such file or directory 2022-05-18 10:00:32.760 EDT [55986] LOG: invalid record length at 0/50000A0: wanted 24, got 0 2022-05-18 10:00:32.761 EDT [55990] LOG: restarted WAL streaming at 0/5000000 on timeline 1 2022-05-18 10:00:32.762 EDT [55990] LOG: replication terminated by primary server 2022-05-18 10:00:32.762 EDT [55990] DETAIL: End of WAL reached on timeline 1 at 0/50000A0. 2022-05-18 10:00:33.300 EDT [55985] LOG: database system is ready to accept read only connections cp: cannot stat '/home/itman/pgdata/archive2/000000010000000000000005': No such file or directory 2022-05-18 10:00:37.713 EDT [55990] LOG: restarted WAL streaming at 0/5000000 on timeline 1 2022-05-18 10:00:37.713 EDT [55990] LOG: replication terminated by primary server 2022-05-18 10:00:37.713 EDT [55990] DETAIL: End of WAL reached on timeline 1 at 0/50000A0. cp: cannot stat '/home/itman/pgdata/archive2/000000010000000000000005': No such file or directory 2022-05-18 10:00:42.739 EDT [55990] LOG: restarted WAL streaming at 0/5000000 on timeline 1 2022-05-18 10:00:42.739 EDT [55990] LOG: replication terminated by primary server 2022-05-18 10:00:42.739 EDT [55990] DETAIL: End of WAL reached on timeline 1 at 0/50000A0. 2022-05-18 10:00:45.480 EDT [55985] LOG: received fast shutdown request 2022-05-18 10:00:45.484 EDT [55985] LOG: aborting any active transactions 2022-05-18 10:00:45.486 EDT [55990] FATAL: terminating walreceiver process due to administrator command 2022-05-18 10:00:45.507 EDT [55993] LOG: shutting down 2022-05-18 10:00:45.581 EDT [55985] LOG: database system is shut down
total 98310 -rw------- 1 itman itman 16777216 May 18 09:53 000000010000000000000001 -rw------- 1 itman itman 16777216 May 18 09:53 000000010000000000000002 -rw------- 1 itman itman 337 May 18 09:53 000000010000000000000002.00000028.backup -rw------- 1 itman itman 16777216 May 18 09:56 000000010000000000000003 -rw------- 1 itman itman 16777216 May 18 09:59 000000010000000000000004 -rw------- 1 itman itman 16777216 May 18 09:59 000000010000000000000005.partial -rw------- 1 itman itman 41 May 18 09:59 00000002.history drwx------+ 1 itman itman 0 May 18 09:59 archive_status -rw------- 1 itman itman 16777216 May 18 09:59 000000020000000000000005
total 81926 -rw------- 1 itman itman 16777216 May 18 09:54 000000010000000000000002 -rw------- 1 itman itman 16777216 May 18 09:56 000000010000000000000003 -rw------- 1 itman itman 16777216 May 18 09:56 000000010000000000000004.partial -rw------- 1 itman itman 42 May 18 09:56 00000001.history -rw------- 1 itman itman 16777216 May 18 09:59 000000010000000000000004 -rw------- 1 itman itman 0 May 18 10:00 00000000.history -rw------- 1 itman itman 41 May 18 10:00 00000002.history drwx------ 1 itman itman 0 May 18 10:00 archive_status -rw------- 1 itman itman 16777216 May 18 10:00 000000010000000000000005
total 65538 -rw-r--r-- 1 itman itman 0 May 18 09:54 00000000.history -rw------- 1 itman itman 16777216 May 18 09:53 000000010000000000000001 -rw------- 1 itman itman 16777216 May 18 09:53 000000010000000000000002 -rw------- 1 itman itman 337 May 18 09:53 000000010000000000000002.00000028.backup -rw------- 1 itman itman 16777216 May 18 09:56 000000010000000000000003 -rw------- 1 itman itman 16777216 May 18 09:59 000000010000000000000005.partial -rw------- 1 itman itman 41 May 18 09:59 00000002.history
total 32769 -rw-r--r-- 1 itman itman 0 May 18 09:53 00000000.history -rw------- 1 itman itman 42 May 18 09:56 00000001.history -rw------- 1 itman itman 16777216 May 18 09:59 000000010000000000000004 -rw------- 1 itman itman 16777216 May 18 09:56 000000010000000000000004.partial