Re: Something else about Redo Logs disappearing

Magnus Hagander <magnus@xxxxxxxxxxxx> · Thu, 11 Jun 2020 22:35:13 +0200

On Thu, Jun 11, 2020 at 10:13 PM Peter <pmc@xxxxxxxxxxxxxxxxxxxxxxx> wrote:

Okay. So let's behave like professional people and figure out how that
can be achieved:

At first, we drop that WAL requirement, because with WAL archiving
it is already guaranteed that an unbroken chain of WAL is always
present in the backup (except when we have a bug like the one that
led to this discussion).
So this is **not part of the scope**.

I would assume that anybody who deals with backups professionally wouldn't consider that out of scope, but sure, for the sake of argument, let's do that.

! This is only one option though, there are others- you can also use
! pgbackrest to push your backups to s3 (or any s3-compatible data storage
! system, which includes some backup systems), and we'll be adding
! support

! I concur that this is becoming a madhouse, and is pushing past the limit
! for what I'm willing to deal with when trying to assist someone.

Well, then that might be a misconception. I'm traditionally a
consultant, and so I am used to *evaluating* solutions. I don't need
assistance for that, I only need precise technical info.

Excellent. Then let's stick to that.

This STILL needs threaded programming (as I said, there is no way to
avoid that with those "new API"), but in this case it is effectively
reduced to just grabbing the return code of some program that has been
started with "&".

There is *absolutely* no need for threading to use the current APIs. You need to run one query, go do something else, and then run another query. It's 100% sequential, so there is zero need for threads. Now, if you're stuck in a shell script, it's a little more complicated, because the session that starts the backup has to stay open until the backup is stopped. But it does not need threading.
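To make the "one query, do something else, another query" sequence concrete, here is a minimal Python sketch. It assumes a DB-API connection (for example from psycopg2) and the function names used by the PostgreSQL releases current at the time of this thread, pg_start_backup() and pg_stop_backup(); newer releases have renamed them. The function name non_exclusive_backup and the copy_files callback are hypothetical, introduced only for illustration.

```python
from typing import Callable, Tuple


def non_exclusive_backup(conn, copy_files: Callable[[], None],
                         label: str = "nightly") -> Tuple[str, str, str]:
    """Run a non-exclusive base backup over a single DB-API connection.

    conn must stay open from start to stop; if it closes, the backup
    is aborted. copy_files is a hypothetical callback that copies the
    data directory with whatever tool you prefer.
    """
    cur = conn.cursor()

    # Query 1: start the backup (fast = false, exclusive = false).
    cur.execute("SELECT pg_start_backup(%s, false, false)", (label,))
    cur.fetchone()

    # "Go do something else": copy the files, synchronously, no threads.
    copy_files()

    # Query 2: stop the backup on the SAME connection
    # (exclusive = false, wait_for_archive = true).
    cur.execute(
        "SELECT lsn, labelfile, spcmapfile FROM pg_stop_backup(false, true)"
    )
    lsn, labelfile, spcmapfile = cur.fetchone()
    return lsn, labelfile, spcmapfile
```

Everything runs in order on one connection; the only requirement is that the connection is not closed while the files are being copied.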

But then, let's think another step forward: for what purpose do we
actually need to call pg_start_backup() and pg_stop_backup() at all?
I couldn't find exhaustive information about that, only some partial
facts.

Since you don't trust the documentation, I suggest you take a look at https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/access/transam/xlog.c;h=55cac186dc71fcc2f4628f9974b30850bb51eb5d;hb=92c58fd94801dd5c81ee20e26c5bb71ad64552a8#l10438

It has a fair amount of detail of the underlying reasons, and of course links to all the details.

Things that remain to be figured out:

 1. What does pg_start_backup actually do and why would that be
    necessary? I could not find exhaustive information, but this can
    probably be figured out from the source. Currently I know this much:
     - it writes a backup_label file. That is just a few lines of
       ASCII and should not be difficult to produce.

It does that only in exclusive mode, and doing that is one of the big problems with exclusive mode. So don't do that.
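For contrast, in non-exclusive mode the label contents come back as a result column of pg_stop_backup(), and the backup tool stores them inside the backup copy rather than in the live data directory. A minimal Python sketch of that step follows; the helper name write_backup_label and its parameters are hypothetical, while the file names backup_label and tablespace_map are the ones PostgreSQL expects.

```python
from pathlib import Path
from typing import List


def write_backup_label(backup_dir: str, labelfile: str,
                       spcmapfile: str = "") -> List[str]:
    """Store pg_stop_backup() output in the backup copy (hypothetical helper).

    The files go into the root of the backup, never into the running
    server's data directory.
    """
    root = Path(backup_dir)
    root.mkdir(parents=True, exist_ok=True)
    written = []

    # Write bytes so the contents are stored unmodified
    # (no newline translation on any platform).
    (root / "backup_label").write_bytes(labelfile.encode("utf-8"))
    written.append("backup_label")

    # The tablespace map is only written when the server returned one.
    if spcmapfile:
        (root / "tablespace_map").write_bytes(spcmapfile.encode("utf-8"))
        written.append("tablespace_map")

    return written
```

Because the file only ever exists in the backup, a crash of the running server during the backup does not leave a stray backup_label behind in its data directory.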

I now hope very much that Magnus Hagander will tell some of the
impending "failure scenarios", because I am getting increasingly
tired of pondering about probable ones, and searching the old
list entries for them, without finding something substantial.

Feel free to look at the mailing list archives. Many of them have been explained there before. Pay particular attention to the threads around when the deprecated APIs were actually deprecated. I believe somebody around that time also wrote a set of bash scripts that can be used in a pre/post-backup-job combination with the current APIs.

//Magnus