Stephen,
On 03/11/17 00:11, Stephen Frost wrote:
> Sure, that'll work much of the time, but that's about like saying that
> PG could run without fsync being enabled much of the time and everything
> will be ok. Both are accurate, but hopefully you'll agree that PG
> really should always be run with fsync enabled.
It is completely different - this is a 'straw man' argument, and just
serves to confuse this discussion.
>> Also, if what you are suggesting were actually the case, almost
>> everyone's streaming replication (and/or log shipping) would be
>> broken all the time.
>
> No, again, this isn't an argument about if it'll work most of the time
> or not, it's about if it's correct. PG without fsync will work most of
> the time too, but that doesn't mean it's actually correct.
No, it is pointing out that if your argument were correct, then we
should be seeing the side effects described above - we are not, which is
significant.
The crux of your argument seems to concern the synchronization between
pg_basebackup finishing and being sure you have the required archived
WAL. Just so we are all clear: when pg_basebackup ends it essentially
calls do_pg_stop_backup (from xlog.c), which ensures that all required
WAL files are archived - or, to be precise, makes sure archive_command
has been run successfully for each required WAL file.
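If you want to see this in action, a quick (purely illustrative) check
that archive_command is succeeding is to query the pg_stat_archiver
view, e.g.:

    psql -c "SELECT last_archived_wal, failed_count FROM pg_stat_archiver"

A non-zero failed_count (or a stale last_archived_wal) means archiving
is broken or lagging, and the stop-backup step at the end of
pg_basebackup will sit there waiting for it.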
Your entire argument seems to be about whether said WAL is fsync'ed to
disk, and how this is supposedly impossible to ensure from a shell
script. Actually it is quite simply possible: e.g. suppose your archive
command is:

    rsync ... targetserver:/disk

There are several ways to get that to sync:

    rsync ... targetserver:/disk && ssh targetserver sync

Alternatively, amend vm.dirty_bytes on targetserver to be < 16M (the
WAL segment size), or mount /disk with the sync option!
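To make that concrete, here is a minimal sketch of the first variant as
an actual archive_command (the host name and archive path are
placeholders of mine, not anything from this thread):

    # postgresql.conf
    archive_command = 'rsync -a %p targetserver:/wal_archive/%f && ssh targetserver sync'

%p is the path of the WAL segment and %f its file name; the trailing
'ssh targetserver sync' flushes the target's dirty pages before PG
considers the segment archived, and if any part of the chain fails the
whole command returns non-zero, so PG will retry that segment.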
So it is clearly *possible*.
However, I think you are obsessing over the minutiae of fsync to a
single server/disk when there are much more important (read: more likely
to happen) problems to consider. For me, the critical consideration is
not 'are the WAL files there *right now*?' but 'will they be there
tomorrow when I need them for a restore?'. And next: 'will they be the
same/undamaged when I read them tomorrow?'
This is why I'm *not* obsessing about fsyncing - make where you store
these WAL files *reliable*: either via proxying/IP splitting, so you
send everything to more than one server (if we are still thinking server
+ disk = backup solution), or alternatively use a distributed object
store (Swift, S3, etc.) that handles that for you and, in addition,
checksums and heals any individual node's data corruption as well.
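For instance (purely illustrative - the bucket name and tool are my own
choices), pointing archive_command straight at S3 can be as simple as:

    archive_command = 'aws s3 cp %p s3://my-wal-bucket/%f'

The object store then handles replicating and checksumming the stored
segments across nodes, which is exactly the 'will they be there, and
undamaged, tomorrow' property I care about.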
With respect to 'If I would like to develop etc etc..' - err, all I
was doing in this thread was helping the original poster make his
stuff a bit better - I'll continue to do that.
> Ignoring the basic requirements which I outlined isn't helping him get
> to a reliable backup system.
Actually, I was helping him get a *reliable* backup system; I think you
misunderstood how Swift changes the picture compared to a
single-server/single-disk design.
regards
Mark