3) Patroni only performs failovers, even when the primary is shut down regularly. A failover is a promotion of the standby plus an automatic reinstatement (pg_rewind or pg_basebackup) of the former primary.

Time: role site 1 - role site 2
====================
12:00h: primary - standby
        => Some clients committed some transactions; Primary stopped => Failover to standby
12:05h: standby - primary
        => Some clients connected + committed some transactions; Primary stopped => Failover to standby
12:10h: primary - standby


Patroni.yml)
$ cat pcl_l702.yml
scope: pcl_l702
name: pcl_l702@tstm49003
namespace: /patroni/

log:
  level: DEBUG
  dir: /opt/app/patroni/etc/log/
  file_num: 10
  file_size: 104857600

restapi:
  listen: tstm49003.tstglobal.tst.loc:8010
  connect_address: tstm49003.tstglobal.tst.loc:8010

etcd:
  hosts: etcdlab01.tstglobal.tst.loc:2379,etcdlab02.tstglobal.tst.loc:2379,etcdlab03.tstglobal.tst.loc:2379,etcdlab04.tstglobal.tst.loc:2379,etcdlab05.tstglobal.tst.loc:2379
  username: patroni
  password: censored

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    master_start_timeout: 300
    synchronous_mode: true
    postgresql:
      use_pg_rewind: true
      use_slots: true

  # NO BOOTSTRAPPING USED
  method: do_not_bootstrap
  do_not_bootstrap:
    command: /bin/false

postgresql:
  authentication:
    replication:
      username: repadmin
      password: censored
    superuser:
      username: patroni
      password: censored
  callbacks:
    on_reload: /opt/app/patroni/etc/callback_patroni.sh
    on_restart: /opt/app/patroni/etc/callback_patroni.sh
    on_role_change: /opt/app/patroni/etc/callback_patroni.sh
    on_start: /opt/app/patroni/etc/callback_patroni.sh
    on_stop: /opt/app/patroni/etc/callback_patroni.sh
  connect_address: tstm49003.tstglobal.tst.loc:5436
  database: pcl_l702
  data_dir: /pgdata/pcl_l702
  bin_dir: /usr/pgsql-9.6/bin
  listen: localhost,tstm49003.tstglobal.tst.loc,pcl_l702.tstglobal.tst.loc:5436
  pgpass: /home/postgres/.pgpass_patroni
  recovery_conf:
    restore_command: cp /pgxlog_archive/pcl_l702/%f %p
  parameters:
    hot_standby_feedback: on
    wal_keep_segments: 64
  use_pg_rewind: true

watchdog:
  mode: automatic
  device: /dev/watchdog
  safety_margin: 5

tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false


-----Original Message-----
From: Adrian Klaver <adrian.klaver@xxxxxxxxxxx>
Sent: Thursday, 7 November 2019 17:06
To: Zwettler Markus (OIZ) <Markus.Zwettler@xxxxxxxxxx>; pgsql-general@xxxxxxxxxxxxxxxxxxxx
Subject: Re: AW: AW: broken backup trail in case of quickly patroni switchback and forth

On 11/7/19 7:47 AM, Zwettler Markus (OIZ) wrote:

I am heading out the door so I will not have time to look at below until later. For those that get a chance before then, it would be nice to have the Patroni conf file information also. The Patroni information may answer the question, but in case it does not, what actually is a failover in 3) below?

> 1) 9.6
>
> 2)
> $ cat postgresql.conf
> # Do not edit this file manually!
> # It will be overwritten by Patroni!
> include 'postgresql.base.conf'
>
> cluster_name = 'pcl_l702'
> hot_standby = 'on'
> hot_standby_feedback = 'True'
> listen_addresses = 'localhost,tstm49003.tstglobal.tst.loc,pcl_l702.tstglobal.tst.loc'
> max_connections = '100'
> max_locks_per_transaction = '64'
> max_prepared_transactions = '0'
> max_replication_slots = '10'
> max_wal_senders = '10'
> max_worker_processes = '8'
> port = '5436'
> track_commit_timestamp = 'off'
> wal_keep_segments = '8'
> wal_level = 'replica'
> wal_log_hints = 'on'
> hba_file = '/pgdata/pcl_l702/pg_hba.conf'
> ident_file = '/pgdata/pcl_l702/pg_ident.conf'
>
> $ cat postgresql.base.conf
> datestyle = 'iso, mdy'
> default_text_search_config = 'pg_catalog.english'
> dynamic_shared_memory_type = posix
> lc_messages = 'en_US.UTF-8'
> lc_monetary = 'de_CH.UTF-8'
> lc_numeric = 'de_CH.UTF-8'
> lc_time = 'de_CH.UTF-8'
> logging_collector = on
> log_directory = 'pg_log'
> log_rotation_age = 1d
> log_rotation_size = 0
> log_timezone = 'Europe/Vaduz'
> log_truncate_on_rotation = on
> max_connections = 100
> timezone = 'Europe/Vaduz'
> archive_command = 'test ! -f /tmp/pg_archive_backup_running_on_pcl_l702* && rsync --checksum %p /pgxlog_archive/pcl_l702/%f'
> archive_mode = on
> archive_timeout = 1800
> cluster_name = pcl_l702
> cron.database_name = 'pdb_l72_oiz'
> # effective_cache_size
> listen_addresses = '*'
> log_connections = on
> log_destination = 'stderr, csvlog'
> log_disconnections = on
> log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
> log_line_prefix = '%t : %h=>%u@%d : %p-%c-%v : %e '
> log_statement = 'ddl'
> max_wal_senders = 5
> port = 5436
> shared_buffers = 512MB
> shared_preload_libraries = 'auto_explain, pg_stat_statements, pg_cron, pg_statsinfo'
> wal_buffers = 16MB
> wal_compression = on
> wal_level = replica
> # work_mem
>
> 3)
> 12:00h: primary - standby
>         => Some clients committed some transactions; Failover
> 12:05h: standby - primary
>         => Some clients connected + committed some transactions; Failover
> 12:10h: primary - standby
>
>
> On 11/7/19 7:18 AM, Zwettler Markus (OIZ) wrote:
>> I already asked the Patroni folks. They told me this is not related
>> to Patroni but PostgreSQL. ;-)
>
> Hard to say without more information:
>
> 1) Postgres version
>
> 2) Setup/config info
>
> 3) Detail of what happened between 12:00 and 12:10
>
>>
>> - Markus
>>
>>
>> On 11/7/19 5:52 AM, Zwettler Markus (OIZ) wrote:
>>> we are using Patroni for management of our Postgres standby databases.
>>>
>>> we take our (WAL) backups on the primary side based on intervals and thresholds.
>>> our archived WALs are written to a local WAL directory first and moved to tape afterwards.
>>>
>>> we got a case where Patroni switched sides back and forth quickly, e.g.:
>>> 12:00h: primary - standby
>>> 12:05h: standby - primary
>>> 12:10h: primary - standby
>>>
>>> we realised that we will not have a WAL backup of those WALs generated between 12:05h and 12:10h in this scenario.
>>>
>>> how can we make sure that the whole WAL sequence trail will be backed up? any idea?
>>
>> Probably best to ask the Patroni folks:
>>
>> https://github.com/zalando/patroni#community
>>
>>>
>>> - Markus
>>>
>>
>

--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx
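
A rough way to spot the hole described in the original question (WAL generated between 12:05h and 12:10h never reaching the backup) is to scan one site's archive directory for missing segments and missing timelines. The following is only a minimal sketch, not something from the thread: the function names are hypothetical, it assumes the default 16 MB WAL segment size (256 segments per log file), and it assumes each promotion advances the timeline ID by one.

```python
import re
from collections import defaultdict

# Assumption: default 16 MB WAL segments, i.e. 256 segments per 4 GB log file.
SEGMENTS_PER_LOG = 256
# Plain WAL segment names are 24 upper-case hex digits: timeline, log, segment.
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")


def parse_segment(name):
    """Split a WAL file name into (timeline, linear segment number)."""
    timeline = int(name[0:8], 16)
    log = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    return timeline, log * SEGMENTS_PER_LOG + seg


def find_gaps(filenames):
    """Return {timeline: [(first_missing, last_missing), ...]} for holes inside a timeline."""
    per_timeline = defaultdict(set)
    for name in filenames:
        if WAL_NAME.match(name):  # ignores .history, .backup and .partial files
            timeline, number = parse_segment(name)
            per_timeline[timeline].add(number)
    gaps = {}
    for timeline, numbers in per_timeline.items():
        ordered = sorted(numbers)
        missing = [(a + 1, b - 1) for a, b in zip(ordered, ordered[1:]) if b - a > 1]
        if missing:
            gaps[timeline] = missing
    return gaps


def missing_timelines(filenames):
    """Return timeline IDs between the lowest and highest seen that have no segment at all."""
    seen = {parse_segment(n)[0] for n in filenames if WAL_NAME.match(n)}
    if not seen:
        return []
    return [t for t in range(min(seen), max(seen) + 1) if t not in seen]
```

It could be fed with something like os.listdir("/pgxlog_archive/pcl_l702") on each site. In the 12:00h / 12:05h / 12:10h scenario, the WAL written while site 2 was primary would belong to a timeline that has no segments in site 1's archive, which is what missing_timelines is meant to surface.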