Search Postgresql Archives

Re: incremental backups

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Unfortunately, I think I understand your question.  :-)

These TODO items are what you need:

	* Point-In-Time Recovery (PITR)

          o Allow point-in-time recovery to archive partially filled
            write-ahead logs [pitr]

            Currently only full WAL files are archived. This means that the
            most recent transactions aren't available for recovery in case
            of a disk failure. This could be triggered by a user command or
            a timer.

          o Automatically force archiving of partially-filled WAL files when
            pg_stop_backup() is called or the server is stopped

            Doing this will allow administrators to know more easily when
            the archive contains all the files needed for point-in-time
            recovery.

I will try to push to have them done for 8.2.

---------------------------------------------------------------------------

Rick Gigger wrote:
> I guess my email wasn't all that clear.  I will try to rephrase.  I  
> am moving from using the old style pg_dump for backups to using  
> incrementals and want to make sure I understand the process before I  
> go about writing a bunch of scritps.
> 
> To me setting up incremental backup consists of the following  
> components:
> 
> 1) Setting up the WAL archiving.  This one is trivial.
> 2) Doing physical dumps of the $PGDATA directory.  This one is once  
> again trivial.
> 3) Knowing which physical dumps are Good and Not Good.  For a given  
> physical dump D there is are WAL archive files Dstart and Dend for  
> which you much have Dstart and Dend and all files in between.  If you  
> have all those files then the physical dump is Good.  If you don't  
> have them then the dump is worthless to you.
> 4) Knowing which dumps and which archive files can be deleted.  This  
> depends on a number of factors.
> 	a) How far back do you want to be able to do PITR
> 	b) How much space do you have / want to use for PITR
> 	c) Which physical dumps are Good and which are Not Good. (see #3)
> 
> Now I think I have a pretty good plan here except for #3 (and so #4  
> then also suffers).
> 
> Just as an example lets say I'm not concerned so much with PITR as I  
> am recovering from a db crash. I've got all the backups files saved  
> to my backup db server so I can failover to it if my primary db  
> server dies.  I just want to make sure I've got one physical dump  
> that is good.  (This is not my actual situation but it illustrated my  
> point better.)
> 
> Now when I do a physical dump it is not a Good dump.  That is I don't  
> have the end archive file necessary to recover from that physical  
> dump.  That is to say that  when I call pg_backup_start() then copy  
> $PGDATA then call pg_backup_stop() postgres might be on say WAL  
> archive file #5.  Once the physical dump is completed WAL archive  
> file #5 hasn't been archived yet.  I only have up to #4.  So if I  
> delete my old physical dumps and all I've got is this most recent one  
> and my database crashes before #5 gets archived then I am hosed.  I  
> have no good physical backups to start from.
> 
> My main question is about the best way to figure out when a physical  
> dump is Good.
> 
> One strategy is to always keep around lots of physical dumps.  If you  
> keep around 100 dumps you can be pretty sure that in the space of  
> time that those physical dumps take place that at least one WAL file  
> was archived.  In fact if you keep 2 physical dumps you can be fairly  
> certain of this.  If not then you really need to space our your dumps  
> more.
> 
> Is this making sense at this point?
> 
> The problem is that the WAL archiving is triggered by postgres and  
> the rate at which the db is updated.  The physical dumps are  
> triggered by cron and on a purely time based schedule.  So in theory  
> if you had the physical dumps happening once a day but for some odd  
> reason no one updated the database for 4 days then all of a sudden  
> you'd have 2 physical backups and neither of them are good.  If  
> you're db crashes during that time you are hosed.
> 
> Maybe I am arguing a point that is just stupid because this will  
> never happen in real life.  But in that it is my backups system that  
> I will be using to recover from complete and total disaster I just  
> want to have all my bases covered.
> 
> So my ideas on how to determine if a physical dump is Good are as  
> follows.
> 
> 1) When you do the physical backup (after dumping the $PGDATA dir but  
> before calling pg_stop_backup() ) determine the current WAL archive  
> file.  Mark somewhere in the backed up physical dump the last file  
> needed for the dump to be considered good.  Then your deletion  
> scripts can look at the WAL archive files you have and the last one  
> required for the dump to be Good and determine if the dump is Good or  
> not.
> 
> 2) After doing the physical dump but before calling pg_stop_backup()  
> just copy the current WAL file to the physical dump.  If that file  
> later gets archived then the restore commands overwrites your  
> partially completed one so it doesn't hurt but you know that when you  
> call pg_stop_backup() that that physical dump is good.  (Is it ok to  
> copy the current WAL file while it is still in use?)
> 
> Is anyone taking one of these or any other precautions to make sure  
> they've got a good physical dump or does everyone just keep a whole  
> bunch of dumps around, and then actually restore the dump to see if  
> it is good and if not go back to a previous dump?
> 
> I hope that makes more sense.
> 
> Thanks,
> 
> Rick
> 
> On Jan 27, 2006, at 3:33 AM, Richard Huxton wrote:
> 
> > Rick Gigger wrote:
> >> Um, no you didn't read my email at all.  I am aware of all of that  
> >> and it is clearly outlined in the docs.  My email was about a  
> >> specific detail in the process.  Please read it if you want to  
> >> know what my actual question was.
> >
> > I'm not sure your email is quite right as regards the process. You  
> > need:
> >   1. the filesystem backup
> >   2. the WAL file indicated in the history-file
> >   3. all the WAL files later than that
> > to get up to "now".
> >
> > If you don't want to replay up to "now" then you will not need some  
> > of the more recent WAL files. You can't afford to throw them away  
> > though since you've got a rolling backup system running and the  
> > whole point is so you can recover to any point you like.
> >
> > You can however throw away any WAL files older than that indicated  
> > in the history file for your current filesystem-backup. You can  
> > then only restore from that point in time forward.
> >
> > There is no "last one" in the WAL set unless you know the time you  
> > want to restore to. Indeed, the "last one" might not be "full" yet  
> > and therefore archived if you want to restore to 10 seconds ago.
> >
> > Or am I mis-understanding your email too?
> >
> > -- 
> >   Richard Huxton
> >   Archonet Ltd
> >
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
> 
>                http://archives.postgresql.org
> 

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@xxxxxxxxxxxxxxxx               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux