Greetings,

* Kaixi Luo (kaixiluo@xxxxxxxxx) wrote:
> We use PostgreSQL at work and we do daily backups with pg_dump. After that
> we pg_restore the dump and check the database to make sure there isn't any
> data corruption. As the database grows, the whole pg_dump / pg_restore
> cycle time is quickly approaching 24h, so we need to change strategies.

I've found this to be a good strategy also, but it's far from perfect.
Corruption can still occur, for example, in indexes on the primary system.
Generally speaking, pg_dump doesn't exercise indexes, so you won't notice
if an index is corrupt.

> We've thought about using pg_basebackup + continuous archiving as an
> alternative backup process, but I have doubts regarding the safety of such
> a procedure. As far as I know, pg_basebackup is done via rsync (and we also
> archive WALs using rsync), so if by any chance disk corruption occurs on
> the master server, the corruption would be carried over to our backup
> server.

This is correct, but checksums are now available in modern versions of PG
and will detect disk corruption. Those checksums would be carried over to
the backup server and could be verified there by using pg_dump (note that
this still wouldn't help with indexes, but you don't have coverage there
today anyway).

> How can we check for backup corruption in this case? Thank you very much.

There has been some discussion about a specific tool for checking the
checksums throughout the entire system, but I don't know of anyone actively
working on that, unfortunately.

There are a number of tools available to help with online backups and
continuous archiving beyond pg_basebackup and hand-rolled scripts. I'm
personally biased towards and prefer pgBackRest, as I helped start that
project, but there are other tools, such as barman and WAL-E, which would
still be better than trying to implement everything correctly on your own.

Thanks!

Stephen
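
P.S. As a rough sketch of the verification cycle described above (the
hostname, user name, paths, and database name are placeholders, and this
assumes the cluster was initialized with initdb --data-checksums so that
page checksums exist to be verified):

    # Take a base backup from the primary over the replication protocol
    pg_basebackup -h primary.example.com -U replicator -D /backup/base -X stream -P

    # On the backup server, start a throwaway instance on the restored copy,
    # then force a full read of every table; a bad page checksum will surface
    # as an error while pg_dump reads the data.
    pg_dump --dbname=mydb --file=/dev/null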
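With pgBackRest, the day-to-day commands end up looking roughly like this
(the stanza name is a placeholder and the repository and PostgreSQL paths
come from pgbackrest.conf; see the pgBackRest documentation for the full
configuration):

    pgbackrest --stanza=main stanza-create   # one-time repository setup
    pgbackrest --stanza=main backup          # take a backup of the cluster
    pgbackrest --stanza=main info            # list backups and their status

barman and WAL-E provide similar command-line front ends.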