Hi friends,
I am running two Linux machines, kernel 3.13.0-45-generic #74-Ubuntu SMP.
PostgreSQL version 9.4 is on both machines, in a Hot Standby scenario.
Master-slave using WAL file shipping, not streaming replication.
The archive_command on the master is:
archive_command = '/usr/bin/rsync -a -e "ssh" "%p" slave:/data2/postgres/standby/main/incoming/"%f"' #
The recovery.conf on the slave is:
standby_mode = 'on'
restore_command = 'cp /data2/postgres/standby/main/incoming/%f "%p"'
We have a write-intensive workload that generates, for example, 1577 WAL segments per hour, or about 26 segments per minute.
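To put that rate in perspective, here is a rough back-of-the-envelope sketch. It assumes the default 16 MB WAL segment size (I have not changed it, but treat that as an assumption) and that the archiver runs archive_command for one segment at a time. The function name `archive_budget` is just made up for illustration.

```python
# Rough archiving budget for a given WAL generation rate.
# Assumption: 16 MB per WAL segment (the PostgreSQL default).
SEGMENT_MB = 16
SEGMENTS_PER_HOUR = 1577  # rate reported in this post


def archive_budget(segments_per_hour, segment_mb=SEGMENT_MB):
    """Return (segments per minute, seconds available per segment, MB per second)."""
    per_minute = segments_per_hour / 60
    seconds_per_segment = 3600 / segments_per_hour
    mb_per_second = segments_per_hour * segment_mb / 3600
    return per_minute, seconds_per_segment, mb_per_second


if __name__ == "__main__":
    per_minute, seconds_per_segment, mb_per_second = archive_budget(SEGMENTS_PER_HOUR)
    print(f"segments per minute:       {per_minute:.1f}")
    print(f"time budget per segment:   {seconds_per_segment:.2f} s")
    print(f"sustained transfer needed: {mb_per_second:.2f} MB/s")
```

Under those assumptions, each rsync-over-ssh call has to finish in roughly 2.3 seconds, and the link has to sustain about 7 MB/s, just to keep up without falling further behind.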
The slave is far behind the master, by more than 20 hours.
I can see that all the WAL segments on the master are in the ready state, waiting for archive_command to do its job.
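For anyone who wants to check the same thing, this is a minimal sketch of how the backlog can be counted. It relies on the fact that in 9.4 the archiver tracks pending segments as `.ready` files under `pg_xlog/archive_status` in the data directory. The helper name `count_ready` is hypothetical, and the directory is passed on the command line because the actual data directory path differs per install.

```python
import os
import sys


def count_ready(archive_status_dir):
    """Count WAL segments still waiting for archive_command (*.ready files)."""
    return sum(
        1 for name in os.listdir(archive_status_dir) if name.endswith(".ready")
    )


if __name__ == "__main__":
    # Usage: python count_ready.py <data_directory>/pg_xlog/archive_status
    if len(sys.argv) != 2:
        sys.exit("usage: count_ready.py <path to pg_xlog/archive_status>")
    print(count_ready(sys.argv[1]))
```

Running it a few minutes apart should show whether the number of pending segments is growing or shrinking.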
The slave is waiting for the WAL files, as shown in its log below:
2016-11-02 18:57:48 UTC::@:[15698]: LOG: unexpected pageaddr C955/C5000000 in log segment 000000010000C96000000023, offset 0
2016-11-02 18:57:54 UTC::@:[15698]: LOG: restored log file "000000010000C96000000022" from archive
2016-11-02 18:57:54 UTC::@:[15698]: LOG: restored log file "000000010000C96000000023" from archive
2016-11-02 18:57:54 UTC::@:[15698]: LOG: restored log file "000000010000C96000000024" from archive
cp: cannot stat ‘/data2/postgres/standby/main/incoming/000000010000C96000000025’: No such file or directory
2016-11-02 18:57:54 UTC::@:[15698]: LOG: unexpected pageaddr C956/71000000 in log segment 000000010000C96000000025, offset 0
2016-11-02 18:57:58 UTC::@:[15698]: LOG: restored log file "000000010000C96000000024" from archive
cp: cannot stat ‘/data2/postgres/standby/main/incoming/000000010000C96000000025’: No such file or directory
It seems that archive_command is very slow compared with the rate at which WAL segments are generated.
Any suggestions? Should I use another strategy to speed up the archive_command process?
Best Regards,