I think everybody agrees that incremental backup per database, rather than cluster-wide, would be nice, and it would be nice if PG supported it. But given the way PG is architected, with cluster-wide WAL, that is not an easy task to implement. Repeating "other databases have it" doesn't change that.
--
Andreas Joseph Krogh
I am not advocating for database-level incremental backups, but all databases that have them also have cluster-wide/instance-wide WAL. Cluster-wide WAL does not make database-level incremental backups hard. Both Oracle and DB2 have database-level incremental backups, and both have cluster-wide WAL (redo logs). So does SQL Server, which is a bit specific because its log is used both for recovery, as WAL, and for undo, so it is very different from Postgres. Also, SQL Server is, to my knowledge, the only one of those databases that doesn't use MVCC.
The technology that all of those databases employ is a bitmap with one bit per database block. A full backup sets all bits to 0, and whenever a block is modified, the corresponding bit is set to 1. The backup tool, in incremental mode, then copies only the blocks whose bit is 1.
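Here is a minimal sketch of that bitmap ("block change tracking") mechanism in Python; the class and method names are illustrative, not any vendor's actual interface:

```python
# A minimal sketch of the bitmap ("block change tracking") mechanism
# described above: one bit per database block, cleared on a full
# backup, set on every write. Names are illustrative only.
class ChangeTracker:
    def __init__(self, n_blocks: int):
        self.dirty = [False] * n_blocks          # one bit per database block

    def on_full_backup(self) -> None:
        # a full backup resets every bit to 0
        self.dirty = [False] * len(self.dirty)

    def on_block_write(self, block_no: int) -> None:
        # any modification sets the block's bit to 1
        self.dirty[block_no] = True

    def blocks_for_incremental(self) -> list[int]:
        # the incremental backup copies only the blocks whose bit is set
        return [i for i, is_dirty in enumerate(self.dirty) if is_dirty]

tracker = ChangeTracker(n_blocks=1_000_000)
tracker.on_block_write(42)
tracker.on_block_write(7)
print(tracker.blocks_for_incremental())  # [7, 42]
```

The bookkeeping itself is cheap; the operational headaches come from everything built around it, as described next.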
I am not too thrilled by that implementation. In particular, with Oracle there were bugs involving database restore, the "duplicate database" operation, global cache locks, and instance latches, and I've had quite a few headaches with RMAN cumulative incremental backups. My preferred backup method is a storage snapshot. Snapshots can then be replicated to other arrays (NetApp SnapVault, Hitachi HUR, EMC SRDF) or backed up to deduplicated offline storage like AWS Glacier or EMC Data Domain using simple file-level utilities. Once a snapshot is taken, it is a read-only file system and its files are no longer held open by the database processes.
Classic file-level backup tools like pg_basebackup or RMAN are pretty much unusable once your database hits 50 TB or so. With a 50 TB database, your RTO (Recovery Time Objective) will be measured in days: you can only push around 3 TB/hour down a 10 Gbit Ethernet pipe. On the other hand, companies like Facebook, Amazon, Walmart or Target lose hundreds of thousands of USD per hour of downtime; downtime of 15 hours or longer is completely unacceptable. Backup is only the last line of defense and should be used only if everything else fails. And if you have to use a file-level tool like pg_basebackup and tar to restore your VLDB, you should also get your resume ready.
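The arithmetic is easy to check; a quick back-of-the-envelope calculation, assuming the ~3 TB/hour sustained rate above:

```python
# Back-of-the-envelope restore time for a 50 TB database over a
# 10 Gbit/s link. 10 Gbit/s is ~1.25 GB/s (~4.5 TB/hour) in theory;
# ~3 TB/hour is a realistic sustained rate once protocol and disk
# overhead are paid.
db_size_tb = 50
throughput_tb_per_hour = 3
copy_hours = db_size_tb / throughput_tb_per_hour
print(f"~{copy_hours:.1f} hours just to copy the files back")  # ~16.7 hours
# WAL replay to the recovery target comes on top of the copy time.
```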
Incremental backups are awkward and ungainly. You have to restore the full backup and all incrementals taken since that full, and after that you must apply the remaining WAL files. So, by definition, restoring an incremental backup takes longer than restoring a full backup.
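A toy model of the restore sequence makes the point concrete; the function and file names below are purely illustrative, not any real tool's API:

```python
# The incremental restore path is the full-restore path plus two
# extra phases, which is why it cannot be faster than a full restore.
def restore_full(full_backup: str) -> None:
    print(f"restoring full backup {full_backup}")

def restore_incremental_chain(full_backup: str,
                              incrementals: list,
                              wal_files: list) -> None:
    restore_full(full_backup)              # the same work as a full restore...
    for inc in incrementals:               # ...plus every incremental since it
        print(f"applying incremental {inc}")
    for wal in wal_files:                  # ...plus the remaining WAL replay
        print(f"replaying WAL segment {wal}")

restore_incremental_chain("base.tar", ["inc_mon", "inc_tue"],
                          ["000000010000000000000042"])
```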
So, if you have to take backups, get yourself a deduplicating backup utility. Deduplication only stores the backup blocks that differ from the previous backup; in other words, you can take a daily full backup with the space expenditure of an incremental one. Also, if you need backups for regulatory reasons (HIPAA, SOX), incremental backups will not do.
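A minimal sketch of that deduplication idea, assuming fixed-size blocks and SHA-256 content hashes as block identity (real appliances like Data Domain use variable-size chunking, but the space math is the same):

```python
# A daily "full" backup that repeats yesterday's blocks stores only
# references, not data: new space is consumed only by changed blocks.
import hashlib

BLOCK_SIZE = 8192  # PostgreSQL's default block size

def backup(path: str, store: dict) -> list:
    """Back up one file into an append-only {hash: block} store.
    Returns the ordered block-hash manifest needed to restore it."""
    manifest = []
    with open(path, "rb") as f:
        while block := f.read(BLOCK_SIZE):
            digest = hashlib.sha256(block).hexdigest()
            if digest not in store:   # only new or changed blocks cost space
                store[digest] = block
            manifest.append(digest)
    return manifest

def restore(manifest: list, store: dict, path: str) -> None:
    # a single pass over the manifest; no incremental chain to replay
    with open(path, "wb") as f:
        for digest in manifest:
            f.write(store[digest])
```

A second "full" backup of a mostly unchanged file adds almost nothing to the store, yet restoring it is a single pass over its manifest, with no chain of incrementals to apply.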
Long story short, there are much more important things to do in Postgres than database-level incremental backups. BTW, Chuck Norris can take an incremental database-level backup of a Postgres database and recover the database without restoring the preceding full backup. I am not sure whether Chuck Norris is a member of the Postgres steering committee.
--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com