Re: checkpoints taking much longer than expected

Stephen Frost <sfrost@xxxxxxxxxxx> · Sun, 16 Jun 2019 17:16:04 -0400

Greetings,

* Tiemen Ruiten (t.ruiten@xxxxxxxxxxx) wrote:
> On Sun, Jun 16, 2019 at 7:30 PM Stephen Frost <sfrost@xxxxxxxxxxx> wrote:
> > Ok, so you want fewer checkpoints because you expect to failover to a
> > replica rather than recover the primary on a failure.  If you're doing
> > synchronous replication, then that certainly makes sense.  If you
> > aren't, then you're deciding that you're alright with losing some number
> > of writes by failing over rather than recovering the primary, which can
> > also be acceptable but it's certainly much more questionable.
> 
> Yes, in our setup that's the case: a few lost transactions will have a
> negligible impact to the business.

There's a risk here that the impact could be much higher than just 'a
few'.  Hopefully you're monitoring it carefully and have some plan of
what to do if the difference grows very quickly.

> > Alternatively, having a way to more easily make
> > the primary to accepting new writes, flush everything to the replicas,
> > report that it's completed doing so, to allow you to promote a replica
> > without losing anything, and *then* go through the process on the
> > primary of doing a checkpoint, would be kind of nice.
> 
> I suppose that would require being able to demote a master to a slave
> during runtime.

That's.. not the case.

If you're really not worried about losing transactions, then an
immediate shutdown as suggested by Peter G would let you fail over to
one of your not-completely-current replicas..  Or you could just do that
as soon as you start shutting down the primary.  Of course, you'll end
up in a situation where the replica will have fork'ed off of the
timeline from a point prior to where the primary is at, meaning you'd
have to use pg_rewind (or do a restore from an earlier backup) to
re-build the former-primary as a replica, if that's the goal.

If the idea is that you don't want to lose any transactions during this
process, and you want to be able to cleanly bring the former primary
back, then I don't think we've really got a great solution where you can
make sure that all the WAL has been flushed out to the replicas and the
primary just finishes its checkpoint (and thinking about that a bit
more, I'm not sure we could actually do it anyway since we're going to
want to write out that shutdown record, making it a catch-22, but maybe
there's something clever that could be done there..).

Thanks,

Stephen
Attachment:
signature.asc

Description: PGP signature