Hi, I have set up a master/standby pair on PostgreSQL 9.5 on two test servers and am trialling repmgr (https://github.com/2ndQuadrant/repmgr/).
I am testing a switchover using the following:

-bash-4.1$ repmgr -f /etc/repmgr/9.5/repmgr.conf -C /etc/repmgr/9.5/repmgr.conf standby switchover -L DEBUG -v

The switchover appears to hang at the last step of the process:

NOTICE: restarting server using '/usr/pgsql-9.5/bin/pg_ctl -w -D /var/lib/pgsql/9.5/data -m fast restart'
pg_ctl: PID file "/var/lib/pgsql/9.5/data/postmaster.pid" does not exist
Is server running?
starting server anyway

It appears to have worked, though: when I run the cluster show command on both servers, it shows the switchover.

-bash-4.1$ repmgr -f /etc/repmgr/9.5/repmgr.conf cluster show
Role      | Name           | Upstream       | Connection String
----------+----------------|----------------|-------------------------------------------
* master  | itupl-postgen2 |                | host=10.70.3.252 dbname=repmgr user=repmgr
  standby | itupl-postgen1 | itupl-postgen2 | host=10.70.3.251 dbname=repmgr user=repmgr

It is also shown correctly in the repl_nodes table of the two databases.
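For reference, the check I did by eye on the cluster show table could also be scripted. This is only a minimal sketch, assuming the pipe-separated table layout shown above; current_master is a hypothetical helper name of my own, not part of repmgr.

```shell
#!/bin/sh
# Hypothetical helper (not part of repmgr): reads `repmgr cluster show`
# output on stdin and prints the name of the node listed as master.
# Assumes the pipe-separated layout shown in the table above.
current_master() {
  awk -F'|' '$1 ~ /master/ {
    gsub(/^ +| +$/, "", $2)   # trim the padding around the Name column
    print $2
  }'
}

# Example use (same config path as above):
#   repmgr -f /etc/repmgr/9.5/repmgr.conf cluster show | current_master
```

Running it on both servers and comparing the result would confirm that the two nodes agree on which one is the master.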
Why is it hanging? Thank you for your help.

Here is the complete output:
-----------------------------------------------
-bash-4.1$ repmgr -f /etc/repmgr/9.5/repmgr.conf -C /etc/repmgr/9.5/repmgr.conf standby switchover -L DEBUG -v
NOTICE: using configuration file "/etc/repmgr/9.5/repmgr.conf"
NOTICE: switching current node 2 to master server and demoting current master to standby...
DEBUG: connecting to: 'host=10.70.3.252 dbname=repmgr user=repmgr fallback_application_name='repmgr''
DEBUG: is_standby(): SELECT pg_catalog.pg_is_in_recovery()
INFO: retrieving node list for cluster 'repmgr_cluster'
DEBUG: get_master_connection(): SELECT id, conninfo, CASE WHEN type = 'master' THEN 1 ELSE 2 END AS type_priority FROM "repmgr_repmgr_cluster".repl_nodes WHERE cluster = 'repmgr_cluster' AND type != 'witness' ORDER BY active DESC, type_priority, priority, id
INFO: checking role of cluster node '1'
DEBUG: connecting to: 'host=10.70.3.251 dbname=repmgr user=repmgr fallback_application_name='repmgr''
DEBUG: is_standby(): SELECT pg_catalog.pg_is_in_recovery()
DEBUG: get_master_connection(): current master node is 1
DEBUG: get_node_record(): SELECT id, type, upstream_node_id, name, conninfo, slot_name, priority, active FROM "repmgr_repmgr_cluster".repl_nodes WHERE cluster = 'repmgr_cluster' AND id = 1
DEBUG: remote node name is "itupl-postgen1"
DEBUG: test_ssh_connection(): executing ssh -o Batchmode=yes 10.70.3.251 /bin/true 2>/dev/null
DEBUG: get_pg_setting(): SELECT name, setting FROM pg_catalog.pg_settings WHERE name = 'data_directory'
DEBUG: get_pg_setting(): returned value is "/var/lib/pgsql/9.5/data"
DEBUG: master's data directory is: /var/lib/pgsql/9.5/data
DEBUG: remote_command(): ssh -o Batchmode=yes 10.70.3.251 ls '/var/lib/pgsql/9.5/data/PG_VERSION' >/dev/null 2>&1 && echo 1 || echo 0
DEBUG: remote_command(): output returned was: 1
DEBUG: PG_VERSION found in /var/lib/pgsql/9.5/data
DEBUG: remote_command(): ssh -o Batchmode=yes 10.70.3.251 ls '/usr/pgsql-9.5/bin/pg_rewind' >/dev/null 2>&1 && echo 1 || echo 0
DEBUG: remote_command(): output returned was: 1
DEBUG: guc_set(): SELECT true FROM pg_catalog.pg_settings WHERE name = 'full_page_writes' AND setting = 'off'
DEBUG: guc_set(): SELECT true FROM pg_catalog.pg_settings WHERE name = 'wal_log_hints' AND setting = 'on'
INFO: looking for file "/etc/repmgr/9.5/repmgr.conf" on remote server "10.70.3.251"
DEBUG: remote_command(): ssh -o Batchmode=yes 10.70.3.251 ls '/etc/repmgr/9.5/repmgr.conf' >/dev/null 2>&1 && echo 1 || echo 0
DEBUG: remote_command(): output returned was: 1
INFO: remote configuration file "/etc/repmgr/9.5/repmgr.conf" found on remote server
DEBUG: remote_archive_config_dir: /tmp/repmgr-itupl-postgen1-archive
DEBUG: Executing: /usr/pgsql-9.5/bin/repmgr standby archive-config -f '/etc/repmgr/9.5/repmgr.conf' --config-archive-dir='/tmp/repmgr-itupl-postgen1-archive'
DEBUG: remote_command(): ssh -o Batchmode=yes 10.70.3.251 /usr/pgsql-9.5/bin/repmgr standby archive-config -f '/etc/repmgr/9.5/repmgr.conf' --config-archive-dir='/tmp/repmgr-itupl-postgen1-archive'
WARNING: nonstandard use of escape in a string literal
LINE 1: ...config_file, regexp_replace(config_file, '^.*\/',''...
^
HINT: Use the escape string syntax for escapes, e.g., E'\r\n'.
NOTICE: 3 files copied to /tmp/repmgr-itupl-postgen1-archive
DEBUG: remote_command(): output returned was:
DEBUG: remote_command(): ssh -o Batchmode=yes 10.70.3.251 /usr/pgsql-9.5/bin/pg_ctl -D '/var/lib/pgsql/9.5/data' -m fast -W stop >/dev/null 2>&1 && echo 1 || echo 0
DEBUG: remote_command(): output returned was: 1
DEBUG: remote_command(): ssh -o Batchmode=yes 10.70.3.251 ls '/var/lib/pgsql/9.5/data/postmaster.pid' >/dev/null 2>&1 && echo 1 || echo 0
DEBUG: remote_command(): output returned was: 0
NOTICE: current master has been stopped
INFO: connecting to standby database
DEBUG: connecting to: 'host=10.70.3.252 dbname=repmgr user=repmgr fallback_application_name='repmgr''
INFO: connected to standby, checking its state
DEBUG: is_standby(): SELECT pg_catalog.pg_is_in_recovery()
INFO: retrieving node list for cluster 'repmgr_cluster'
DEBUG: get_master_connection(): SELECT id, conninfo, CASE WHEN type = 'master' THEN 1 ELSE 2 END AS type_priority FROM "repmgr_repmgr_cluster".repl_nodes WHERE cluster = 'repmgr_cluster' AND type != 'witness' ORDER BY active DESC, type_priority, priority, id
INFO: checking role of cluster node '1'
DEBUG: connecting to: 'host=10.70.3.251 dbname=repmgr user=repmgr fallback_application_name='repmgr''
ERROR: connection to database failed: could not connect to server: Connection refused
Is the server running on host "10.70.3.251" and accepting TCP/IP connections on port 5432?
INFO: checking role of cluster node '2'
DEBUG: connecting to: 'host=10.70.3.252 dbname=repmgr user=repmgr fallback_application_name='repmgr''
DEBUG: is_standby(): SELECT pg_catalog.pg_is_in_recovery()
NOTICE: promoting standby
DEBUG: get_pg_setting(): SELECT name, setting FROM pg_catalog.pg_settings WHERE name = 'data_directory'
DEBUG: get_pg_setting(): returned value is "/var/lib/pgsql/9.5/data"
NOTICE: promoting server using '/usr/pgsql-9.5/bin/pg_ctl -D /var/lib/pgsql/9.5/data promote'
server promoting
INFO: reconnecting to promoted server
DEBUG: connecting to: 'host=10.70.3.252 dbname=repmgr user=repmgr fallback_application_name='repmgr''
DEBUG: is_standby(): SELECT pg_catalog.pg_is_in_recovery()
DEBUG: is_standby(): SELECT pg_catalog.pg_is_in_recovery()
DEBUG: setting node 2 as master and marking existing master as failed
DEBUG: begin_transaction()
DEBUG: commit_transaction()
NOTICE: STANDBY PROMOTE successful
DEBUG: create_event_record(): INSERT INTO "repmgr_repmgr_cluster".repl_events ( node_id, event, successful, details ) VALUES ($1, $2, $3, $4) RETURNING event_timestamp
DEBUG: create_event_record(): Event timestamp is "2017-05-22 16:56:06.860066+09:30"
NOTICE: Executing pg_rewind on old master server
DEBUG: pg_rewind command is: '/usr/pgsql-9.5/bin/pg_rewind' -D '/var/lib/pgsql/9.5/data' --source-server=\'host=10.70.3.252 dbname=repmgr user=repmgr\'
DEBUG: remote_command(): ssh -o Batchmode=yes 10.70.3.251 '/usr/pgsql-9.5/bin/pg_rewind' -D '/var/lib/pgsql/9.5/data' --source-server=\'host=10.70.3.252 dbname=repmgr user=repmgr\'
DEBUG: remote_command(): output returned was: servers diverged at WAL position 1/1D000098 on timeline 11
no rewind required
DEBUG: remote_command(): ssh -o Batchmode=yes 10.70.3.251 /usr/pgsql-9.5/bin/repmgr standby restore-config -D '/var/lib/pgsql/9.5/data' --config-archive-dir='/tmp/repmgr-itupl-postgen1-archive'
ERROR: unable to determine cluster name - please provide a valid configuration file with -c/--config-file
HINT: Use -F/--force to continue anyway
DEBUG: remote_command(): output returned was:
DEBUG: remote_command(): ssh -o Batchmode=yes 10.70.3.251 test -e '/var/lib/pgsql/9.5/data/recovery.done' && rm -f '/var/lib/pgsql/9.5/data/recovery.done'
DEBUG: remote_command(): output returned was:
DEBUG: Executing: /usr/pgsql-9.5/bin/repmgr -D '/var/lib/pgsql/9.5/data' -f '/etc/repmgr/9.5/repmgr.conf' -h 10.70.3.252 -d repmgr -U repmgr standby follow
DEBUG: remote_command(): ssh -o Batchmode=yes 10.70.3.251 /usr/pgsql-9.5/bin/repmgr -D '/var/lib/pgsql/9.5/data' -f '/etc/repmgr/9.5/repmgr.conf' -h 10.70.3.252 -d repmgr -U repmgr standby follow
NOTICE: restarting server using '/usr/pgsql-9.5/bin/pg_ctl -w -D /var/lib/pgsql/9.5/data -m fast restart'
pg_ctl: PID file "/var/lib/pgsql/9.5/data/postmaster.pid" does not exist
Is server running?
starting server anyway

Regards
Dylan