Hi, I have reverted to cp as the archive command, but now under heavy load (> 150 WAL segments per minute) it happens that some WAL segments get corrupted:
postgres@lemur:~/9.1/main/pg_xlog$ md5sum 000000010000001000000049
f1906d2745224430f811496df466203f 000000010000001000000049
postgres@lemur:~/9.1/main/pg_xlog$ md5sum ~/backups/wal/000000010000001000000049
7e73fe759e41e427497360a815f9d3e1 /var/lib/postgresql/backups/wal/000000010000001000000049
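By "cp as archive command" I mean something roughly along the lines of the example in the PostgreSQL documentation; the target directory below is only the one visible in the md5sum output above:

    archive_command = 'test ! -f /var/lib/postgresql/backups/wal/%f && cp %p /var/lib/postgresql/backups/wal/%f'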
On Fri, Apr 26, 2013 at 10:55 AM, Albe Laurenz <laurenz.albe@xxxxxxxxxx> wrote:
German Becker wrote:
> Here is the archive part of the config:
>
> archive_mode = on                # allows archiving to be done
>                                  # (change requires restart)
> archive_command = '/var/lib/postgresql/scripts/archive_copy.sh %p %f'
>                                  # command to use to archive a logfile segment
> #archive_timeout = 0             # force a logfile segment switch after this
>                                  # number of seconds; 0 disables
>
> The archive command makes a local copy and then it copies to the backup server via ssh. Both copies
> are md5-checked and retried up to 3 times in case of failure.

So the problem might be in that script.

archive_command should not retry the operation, but rather
return a non-zero return code.
See http://www.postgresql.org/docs/current/static/continuous-archiving.html#BACKUP-ARCHIVING-WAL
> I have seen under heavy load that some WALs are skipped, some are smaller than expected, some are corrupted (i.e.
> the loop fails 3 times).
> I'm not sure about the return value (checking it). What is the expected behaviour of the archiver?
> Will it retry the archive if the archive command returns something other than 0? Will it retain the WAL segment
> until it is successfully archived?
archive_command should exit with zero only if the
WAL segment was archived successfully.
PostgreSQL will retry and retain the WAL segment until
archival succeeds.
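A minimal sketch of such a non-retrying wrapper (the script name, variable names and target path are only illustrative, not your actual archive_copy.sh):

    #!/bin/bash
    # Hypothetical archive wrapper, called as: archive_copy.sh %p %f
    WAL_PATH="$1"    # %p: path of the WAL segment, relative to the data directory
    WAL_FILE="$2"    # %f: file name of the WAL segment
    DEST="/var/lib/postgresql/backups/wal/$WAL_FILE"

    # Never overwrite a segment that has already been archived.
    test ! -f "$DEST" || exit 1

    # One copy attempt only; on failure exit non-zero so that PostgreSQL
    # keeps the segment in pg_xlog and retries the archive_command later.
    cp "$WAL_PATH" "$DEST" || exit 1

    # Verify the copy before reporting success.
    cmp -s "$WAL_PATH" "$DEST" || exit 1

    exit 0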
Yours,
Laurenz Albe