Re: Howto: Using PITR recovery for standby replication

"Benjamin Krajmalnik" <kraj@xxxxxxxxxxx> · Thu, 20 Apr 2006 21:52:30 -0600

Title: Re: [ADMIN] Howto: Using PITR recovery for standby replication 

Tom,

First of all forgive me if I am totally 
incorrect - I may very well be:)  If so, believe me I will be a very happy 
camper since my concerns will be void.  My concern was raised when I backed 
up the server which was receiving production data, and I restored it in a 
developmen server.  The difference between both of them is tht the 
production server has a very high row insertion rate, while the development 
server has about 10 rows per minute inserted (just to enable us to check tht our 
real time aggregation code and graphical display routines are working 
properly).

After restoring, when we fired up the 
service responsible for record insertion, I began receiving the constraint 
violations on the columns controlled by the sequences.  The table had 
higher values in them than the sequences.  This raised a huge red flag for 
me.  My concern was that the aggreegated data tables may not reflect the 
data in the raw inserted tables - essentially, that they may be out of 
sync.

The particular table which was 
problematic (and for which I posted another message due to the unique 
constraint violation which I am seeing intermittently) is the one with 
the high insertion rate.  The sequence is currently being used to 
facilitate purginf of old records.  However, as I study and play more 
with PostgreSQL, I found the ability to partition a table.  Once I 
move to table partitioning, my problem of ourgin data past retention periods 
will be fixed.

My entire conecpt may have been 
incorrect and is based with my experiences with MS SQL Server whereby when 
I purged records based on the date, due to the large amounts of data huge 
transaction logs were created, and in some cases ended up using so much 
diskspace that the database imploded!  The workaroound which I created 
under SQL Server was to assign an identity field to each row, select the minimum 
value for the day to be purged, and then purge records 10,000 at a 
time within transactions.  This kept the transaction file small 
and the database from exploding dye to running out of disk 
space.

It is very possible that this may not have 
been an issue with PostgreSQL, but I could not take a chance, so I ported the 
methodology over.  The new architecture will have a table partition 
for each month (12 partitions).  Once the retention period of the given 
partition expires it will simply be truncated.

Sorry for the rambling, but, if I 
understand correctly from you, the only items which were out of synch were 
the sequences, but all of the tables would have maintained consistency relative 
to each other?  If so, once I get rid of the unnecessary sequences, I can 
create a small function to be run after a restore which can reset the 
sequences to the proper value.   That would be simple enough, and 
would provide an easily implemented solution.

You'll probably see me in here asking lots 
of questins as I cut my teeth on PostgreSQL.  Hopefully, at some point in 
the future I will be able to contribute back with solutions :)

Once again, thank you.   Also, 
did you receive the snippet of the stored procedure which I sent you?  As I 
mentioned, the only place where row insertion is performed is via that stored 
procedure, and the sequences were created by defining the columns 
as "bigserial", which still has me puzzled as to why I am experiencing 
the contraing violation on the unique primary key.

Regards,

Benjamin 

From: Tom Lane 
[mailto:tgl@xxxxxxxxxxxxx]
Sent: Thu 4/20/2006 9:09 PM
To: 
Benjamin Krajmalnik
Cc: pgsql-admin@xxxxxxxxxxxxxx
Subject: 
Re: [ADMIN] Howto: Using PITR recovery for standby replication 

"Benjamin Krajmalnik" <kraj@xxxxxxxxxxx> writes:
> I 
have tried using pg_dump, but discovered that the backup was not a =
> 
consistent backup.

Really?

> Back to the problem I faced when 
testing backups with pg_dump, it =
> appears that the backup was not a 
consistent backup of the data.  For ="">> example, sequences which are 
used by some tables bo longer held the =
> correct values (the tables now 
held higher values),

Sequences are non-transactional, so pg_dump might 
well capture a higher
value of the sequence counter than is reflected in any 
table row, but
there are numerous other ways by which a gap can appear in the 
set of
sequence values.  That's not a bug.  If you've got real 
discrepancies
in pg_dump's output, a lot of us would like to know about 
'em.

        regards, tom 
lane