Re: Does PostgreSQL check database integrity at startup?

Edson Carlos Ericksson Richter <richter@xxxxxxxxxxxxxx> · Thu, 28 Dec 2017 15:46:43 -0200

Em 28/12/2017 10:16, Stephen Frost escreveu:
Alvaro,

* Alvaro Herrera (alvherre@xxxxxxxxxxxxxx) wrote:
For context: this was first reported in the Barman forum here:
https://groups.google.com/forum/#!msg/pgbarman/3aXWpaKWRFI/weUIZxspDAAJ
They are using Barman for the backups.
Ahhhh, I see.  I wasn't aware of that history.

Stephen Frost wrote:

But at some point in time, slave became corrupt (one of the base
files are zero size where it should be 16Mb in size), and IMHO a
"red alert" should arise - Slave server shall not even startup at
all.
How do you know it should be 16Mb in size...?  That sounds like you're
describing a WAL file, but you should be archiving your WAL files during
a backup, not just using whatever is in pg_xlog/pg_wal..
It's not a WAL file -- it's a file backing a table.
Interesting.

Since backups are taken from slave server, all backups are also corrupt.
If you aren't following the appropriate process to perform a backup
then, yes, you're going to end up with corrupt and useless/bad backups.
A few guys went over the backup-taking protocol upthread already.

But anyway the backup tool is a moot point.  The problem doesn't
originate in the backup -- it originates in the standby, from where the
backup is taken.  The file can be seen as size 0 in the standby.
Edson's question is: why wasn't the problem detected in the standby?
It seems a valid question to me, to which we currently we don't have any
good answer.
The last message on that thread seems pretty clear to me- the comment is
"I think this is a failure in standby build."  It's not clear what that
failure was but I agree it doesn't appear related to the backup tool
(the comment there is "I'm using rsync"), or, really, PostgreSQL at all
(a failure during the build of the replica isn't something we're
necessairly going to pick up on..).

As discussed on this thread, zero-byte files are entirely valid to
appear in the PostgreSQL data directory.

To try and dig into what happened, I'd probably look at what forks there
are of that relation, the entry in pg_class, and try to figure out how
it is that replication isn't complaining when the file on the primary
appeared to be modified well after the last modify timestamp on the
replica.  If it's possible to replica this into a test environment,
maybe even do a no-op update of a row of that table and see what happens
with replication.  One thing I wonder is if this table used to be
unlogged and it was later turned into a logged table but something
didn't quite happen correctly with that.  I'd also suggest looking for
other file size mismatches between the primary and the replica.

Thanks!

Stephen
The table was never unlogged. From very beginning, it was always logged. 
I've dozens of databases with exactly same setup - and right now, I'm 
rebuilding the slave server. Instead of investigating something probably 
I will not find the cause, I would like to have a alert for the future.

Would be possible to include in future versions:
1) After start standby, standby run all WAL files until it is 
synchronized with master (current behavior)
3) Before getting into "accept read only queries", check if all base 
files have same size as master server (new behavior). In case something 
is different, throw an error and stop database startup?
4) Then start "accept read only queries" (current behavior)
???

Thanks,

Edson