Hello,
Have Streaming replica with 2 hosts, master and slave with PostgreSQL 10.6. Replica works fine and data from master copying to slave without any problem,
but on slave have a problem: wal-files in catalog ${PGDATA}/pg_wal is not removing authomaticaly and may be collects over some years.
Both, master (1.1.1.1) and slave (2.2.2.2), have simmilar replication slots:
master=# select * from pg_replication_slots;
slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
---------------------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------
standby_slot | | physical | | | f | t | 1184 | | | 8/1C16C508 |
(1 row)
slave=# select * from pg_replication_slots;
slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
---------------------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------
standby_slot | | physical | | | f | t | 1184 | | | 8/1C16C508 |
(1 row)
which seted in ${PGDATA}/recovery.done on master:
standby_mode = 'on'
primary_conninfo = 'user=pgadmin host=1.1.1.1 port=5432 user=pgadmin sslmode=prefer sslcompression=1'
recovery_target_timeline = 'latest'
trigger_file = 'failover'
restore_command = 'cp /opt/archivelog/%f %p'
archive_cleanup_command = '/usr/pgsql-10/bin/pg_archivecleanup /opt/archivelog %r'
primary_slot_name = 'standby_slot'
and seted in ${PGDATA}/recovery.conf on slave:
standby_mode = 'on'
primary_conninfo = 'user=pgadmin host=2.2.2.2 port=5432 user=pgadmin sslmode=prefer sslcompression=1'
recovery_target_timeline = 'latest'
trigger_file = 'failover'
restore_command = 'cp /opt/archivelog/%f %p'
archive_cleanup_command = '/usr/pgsql-10/bin/pg_archivecleanup /opt/archivelog %r'
primary_slot_name = 'standby_slot'
Both archive_cleanup_command's on both hosts works fine in console and no needs to set permissions on /opt/archivelog or /usr/pgsql-10/bin/pg_archivecleanup.
In ${PGDATA}/postgresql.conf on both hosts sets:
max_wal_size = 8GB
wal_keep_segments = 32
But yesterday pg_wal was 16G and had 1018 files inside, thats why there was no other way except deleting its manualy, but not of all,
just 6 hundreds files, which had modification time before September 2020 except 000000010000000000000034.00000028.backup.
Trying to rebuild slave from master with: making pg_start_backup, copying ${PGDATA} files from master to slave (without removing on slave any files in ${PGDATA}),
pg_stop_backup, tunning settings on slave and run it again had no any success, wal-files still is not removing automaticaly.
Does anybody knows how to fix it ?
Regards, Andrew
Have Streaming replica with 2 hosts, master and slave with PostgreSQL 10.6. Replica works fine and data from master copying to slave without any problem,
but on slave have a problem: wal-files in catalog ${PGDATA}/pg_wal is not removing authomaticaly and may be collects over some years.
Both, master (1.1.1.1) and slave (2.2.2.2), have simmilar replication slots:
master=# select * from pg_replication_slots;
slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
---------------------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------
standby_slot | | physical | | | f | t | 1184 | | | 8/1C16C508 |
(1 row)
slave=# select * from pg_replication_slots;
slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
---------------------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------
standby_slot | | physical | | | f | t | 1184 | | | 8/1C16C508 |
(1 row)
which seted in ${PGDATA}/recovery.done on master:
standby_mode = 'on'
primary_conninfo = 'user=pgadmin host=1.1.1.1 port=5432 user=pgadmin sslmode=prefer sslcompression=1'
recovery_target_timeline = 'latest'
trigger_file = 'failover'
restore_command = 'cp /opt/archivelog/%f %p'
archive_cleanup_command = '/usr/pgsql-10/bin/pg_archivecleanup /opt/archivelog %r'
primary_slot_name = 'standby_slot'
and seted in ${PGDATA}/recovery.conf on slave:
standby_mode = 'on'
primary_conninfo = 'user=pgadmin host=2.2.2.2 port=5432 user=pgadmin sslmode=prefer sslcompression=1'
recovery_target_timeline = 'latest'
trigger_file = 'failover'
restore_command = 'cp /opt/archivelog/%f %p'
archive_cleanup_command = '/usr/pgsql-10/bin/pg_archivecleanup /opt/archivelog %r'
primary_slot_name = 'standby_slot'
Both archive_cleanup_command's on both hosts works fine in console and no needs to set permissions on /opt/archivelog or /usr/pgsql-10/bin/pg_archivecleanup.
In ${PGDATA}/postgresql.conf on both hosts sets:
max_wal_size = 8GB
wal_keep_segments = 32
But yesterday pg_wal was 16G and had 1018 files inside, thats why there was no other way except deleting its manualy, but not of all,
just 6 hundreds files, which had modification time before September 2020 except 000000010000000000000034.00000028.backup.
Trying to rebuild slave from master with: making pg_start_backup, copying ${PGDATA} files from master to slave (without removing on slave any files in ${PGDATA}),
pg_stop_backup, tunning settings on slave and run it again had no any success, wal-files still is not removing automaticaly.
Does anybody knows how to fix it ?
Regards, Andrew