Re: incremental backups

Bruce Momjian <pgman@xxxxxxxxxxxxxxxx> · Tue, 31 Jan 2006 00:20:38 -0500 (EST)

Yes, I think copying it while it is being written is safe.

---------------------------------------------------------------------------

Rick Gigger wrote:
> Yes!  Thanks you!  That is exactly what I was looking for.
> 
> So I take it that this means that it is save to copy the current in  
> use WAL file even as it is being written to?
> And it also means that if I copy it with my physical file system  
> backup then I should have the last file that I need to restore from  
> that physical backup?
> 
> So if I write my own backup_latest_WAL_file.sh script (I think I  
> found one on the list from Simon Riggs) then I can do what I need to  
> do before those todo items get done?  Or will I need to wait till  
> postgres gives me the ability to safely copy the file?
> 
> 
> 
> On Jan 30, 2006, at 11:13 AM, Bruce Momjian wrote:
> 
> >
> > Unfortunately, I think I understand your question.  :-)
> >
> > These TODO items are what you need:
> >
> > 	* Point-In-Time Recovery (PITR)
> >
> >           o Allow point-in-time recovery to archive partially filled
> >             write-ahead logs [pitr]
> >
> >             Currently only full WAL files are archived. This means  
> > that the
> >             most recent transactions aren't available for recovery  
> > in case
> >             of a disk failure. This could be triggered by a user  
> > command or
> >             a timer.
> >
> >           o Automatically force archiving of partially-filled WAL  
> > files when
> >             pg_stop_backup() is called or the server is stopped
> >
> >             Doing this will allow administrators to know more  
> > easily when
> >             the archive contains all the files needed for point-in- 
> > time
> >             recovery.
> >
> > I will try to push to have them done for 8.2.
> >
> > ---------------------------------------------------------------------- 
> > -----
> >
> > Rick Gigger wrote:
> >> I guess my email wasn't all that clear.  I will try to rephrase.  I
> >> am moving from using the old style pg_dump for backups to using
> >> incrementals and want to make sure I understand the process before I
> >> go about writing a bunch of scritps.
> >>
> >> To me setting up incremental backup consists of the following
> >> components:
> >>
> >> 1) Setting up the WAL archiving.  This one is trivial.
> >> 2) Doing physical dumps of the $PGDATA directory.  This one is once
> >> again trivial.
> >> 3) Knowing which physical dumps are Good and Not Good.  For a given
> >> physical dump D there is are WAL archive files Dstart and Dend for
> >> which you much have Dstart and Dend and all files in between.  If you
> >> have all those files then the physical dump is Good.  If you don't
> >> have them then the dump is worthless to you.
> >> 4) Knowing which dumps and which archive files can be deleted.  This
> >> depends on a number of factors.
> >> 	a) How far back do you want to be able to do PITR
> >> 	b) How much space do you have / want to use for PITR
> >> 	c) Which physical dumps are Good and which are Not Good. (see #3)
> >>
> >> Now I think I have a pretty good plan here except for #3 (and so #4
> >> then also suffers).
> >>
> >> Just as an example lets say I'm not concerned so much with PITR as I
> >> am recovering from a db crash. I've got all the backups files saved
> >> to my backup db server so I can failover to it if my primary db
> >> server dies.  I just want to make sure I've got one physical dump
> >> that is good.  (This is not my actual situation but it illustrated my
> >> point better.)
> >>
> >> Now when I do a physical dump it is not a Good dump.  That is I don't
> >> have the end archive file necessary to recover from that physical
> >> dump.  That is to say that  when I call pg_backup_start() then copy
> >> $PGDATA then call pg_backup_stop() postgres might be on say WAL
> >> archive file #5.  Once the physical dump is completed WAL archive
> >> file #5 hasn't been archived yet.  I only have up to #4.  So if I
> >> delete my old physical dumps and all I've got is this most recent one
> >> and my database crashes before #5 gets archived then I am hosed.  I
> >> have no good physical backups to start from.
> >>
> >> My main question is about the best way to figure out when a physical
> >> dump is Good.
> >>
> >> One strategy is to always keep around lots of physical dumps.  If you
> >> keep around 100 dumps you can be pretty sure that in the space of
> >> time that those physical dumps take place that at least one WAL file
> >> was archived.  In fact if you keep 2 physical dumps you can be fairly
> >> certain of this.  If not then you really need to space our your dumps
> >> more.
> >>
> >> Is this making sense at this point?
> >>
> >> The problem is that the WAL archiving is triggered by postgres and
> >> the rate at which the db is updated.  The physical dumps are
> >> triggered by cron and on a purely time based schedule.  So in theory
> >> if you had the physical dumps happening once a day but for some odd
> >> reason no one updated the database for 4 days then all of a sudden
> >> you'd have 2 physical backups and neither of them are good.  If
> >> you're db crashes during that time you are hosed.
> >>
> >> Maybe I am arguing a point that is just stupid because this will
> >> never happen in real life.  But in that it is my backups system that
> >> I will be using to recover from complete and total disaster I just
> >> want to have all my bases covered.
> >>
> >> So my ideas on how to determine if a physical dump is Good are as
> >> follows.
> >>
> >> 1) When you do the physical backup (after dumping the $PGDATA dir but
> >> before calling pg_stop_backup() ) determine the current WAL archive
> >> file.  Mark somewhere in the backed up physical dump the last file
> >> needed for the dump to be considered good.  Then your deletion
> >> scripts can look at the WAL archive files you have and the last one
> >> required for the dump to be Good and determine if the dump is Good or
> >> not.
> >>
> >> 2) After doing the physical dump but before calling pg_stop_backup()
> >> just copy the current WAL file to the physical dump.  If that file
> >> later gets archived then the restore commands overwrites your
> >> partially completed one so it doesn't hurt but you know that when you
> >> call pg_stop_backup() that that physical dump is good.  (Is it ok to
> >> copy the current WAL file while it is still in use?)
> >>
> >> Is anyone taking one of these or any other precautions to make sure
> >> they've got a good physical dump or does everyone just keep a whole
> >> bunch of dumps around, and then actually restore the dump to see if
> >> it is good and if not go back to a previous dump?
> >>
> >> I hope that makes more sense.
> >>
> >> Thanks,
> >>
> >> Rick
> >>
> >> On Jan 27, 2006, at 3:33 AM, Richard Huxton wrote:
> >>
> >>> Rick Gigger wrote:
> >>>> Um, no you didn't read my email at all.  I am aware of all of that
> >>>> and it is clearly outlined in the docs.  My email was about a
> >>>> specific detail in the process.  Please read it if you want to
> >>>> know what my actual question was.
> >>>
> >>> I'm not sure your email is quite right as regards the process. You
> >>> need:
> >>>   1. the filesystem backup
> >>>   2. the WAL file indicated in the history-file
> >>>   3. all the WAL files later than that
> >>> to get up to "now".
> >>>
> >>> If you don't want to replay up to "now" then you will not need some
> >>> of the more recent WAL files. You can't afford to throw them away
> >>> though since you've got a rolling backup system running and the
> >>> whole point is so you can recover to any point you like.
> >>>
> >>> You can however throw away any WAL files older than that indicated
> >>> in the history file for your current filesystem-backup. You can
> >>> then only restore from that point in time forward.
> >>>
> >>> There is no "last one" in the WAL set unless you know the time you
> >>> want to restore to. Indeed, the "last one" might not be "full" yet
> >>> and therefore archived if you want to restore to 10 seconds ago.
> >>>
> >>> Or am I mis-understanding your email too?
> >>>
> >>> -- 
> >>>   Richard Huxton
> >>>   Archonet Ltd
> >>>
> >>
> >>
> >> ---------------------------(end of  
> >> broadcast)---------------------------
> >> TIP 4: Have you searched our list archives?
> >>
> >>                http://archives.postgresql.org
> >>
> >
> > -- 
> >   Bruce Momjian                        |  http://candle.pha.pa.us
> >   pgman@xxxxxxxxxxxxxxxx               |  (610) 359-1001
> >   +  If your life is a hard drive,     |  13 Roberts Road
> >   +  Christ can be your backup.        |  Newtown Square,  
> > Pennsylvania 19073
> >
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
> 
>                http://archives.postgresql.org
> 

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@xxxxxxxxxxxxxxxx               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073