You could use Slony for this in a couple of ways. One is simpler but
more hacky; the other is a bit more involved but perhaps more "correct":
For the simple way, set up the source database as a Slony provider and
the remote replica as a normal subscriber. Don't run the slon daemons
all the time; just launch them every night and let them run long enough
to catch the replica up. Newer Slony versions let you run slon in a
mode where it terminates once it has caught up to a certain point. This
is simple (see the cron sketch after the list below), but it has a
couple of drawbacks:
- The data will become bloated in the source during the day, because
each insert/update to a replicated table writes log data into
ancillary Slony tables that won't get cleaned out until you replicate
at night. This is probably a pretty minor issue, though, unless the
data volume is huge.
- The replication process will put appreciable load on the source
database, and as the log tables grow it takes longer and longer for
the replica to catch up.
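To illustrate the simple approach, here's a rough sketch of driving it
from cron. The cluster name, connection strings, and paths below are
made up, and I'm not showing the "exit once caught up" option since it
varies by slon version; this just runs the daemons for the nightly
window and kills them afterwards:

    # 2am: start slon for the provider node and the remote subscriber node
    # (hypothetical cluster name "mycluster" and conninfos)
    0 2 * * *  postgres  /usr/local/bin/slon -p /var/run/slon_src.pid  mycluster "dbname=app host=src-db user=slony"    >> /var/log/slon_src.log 2>&1
    0 2 * * *  postgres  /usr/local/bin/slon -p /var/run/slon_dest.pid mycluster "dbname=app host=remote-db user=slony" >> /var/log/slon_dest.log 2>&1
    # 6am: stop them again until the next night
    0 6 * * *  postgres  kill $(cat /var/run/slon_src.pid) $(cat /var/run/slon_dest.pid)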
Another approach is perhaps more "standard" though more complex:
- Set up a Slony provider and subscriber, both on the local network
(you need at least one secondary to use log shipping)
- Set up the remote secondary as a "log shipping" node
- Run the slons for the local provider and subscriber with "log
shipping" mode turned on. This will generate SQL files containing the
data updates made to the provider.
- Periodically tar up the files, ship them over to the remote
secondary, and apply them (using psql or some custom script); a rough
sketch follows below.
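Very roughly, that could look something like this (the hostnames,
paths, and cluster name are placeholders, and double-check the archive
file names your slon actually produces; mine are from memory):

    # On the local subscriber node, run its slon with -a so it writes
    # SYNC archive files (SQL scripts) into a spool directory:
    slon -a /var/spool/slony_archive mycluster "dbname=app host=local-replica user=slony"

    # Nightly shipping job on the local subscriber (hypothetical paths):
    cd /var/spool/slony_archive
    tar czf /tmp/slony_logs.tar.gz slony1_log_*.sql   # gzip helps a lot on a slow link
    scp /tmp/slony_logs.tar.gz remote-host:/var/spool/slony_incoming/
    mkdir -p shipped && mv slony1_log_*.sql shipped/  # don't ship the same files twice

    # On the remote secondary, unpack and apply the files in name order
    # (the zero-padded counters make lexical order the correct order):
    cd /var/spool/slony_incoming
    tar xzf slony_logs.tar.gz
    mkdir -p applied
    for f in slony1_log_*.sql; do
        psql -v ON_ERROR_STOP=1 -d app -f "$f" || break   # stop at first failure; order matters
        mv "$f" applied/
    done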
An advantage of the latter approach is that you can compress the log
shipping files. They are not very compact (since they contain SQL
statements plus your data), but they should compress quite well.
hth,
-Casey
On Jan 10, 2007, at 11:02 AM, Gene wrote:
I searched the web sites for advice but came up short on something
that looked like what I need. I need to "replicate" a few tables
nightly (2am-6am). Each table will have about 1 million new records
(~100B each, ~5/second) each day.
I have tried using something like a cron job (COPY TO / COPY FROM), but
it has limitations as far as handling duplicate keys (I just need to
replace them), and it will not fill in gaps if there happen to be any.
The servers are also far from each other and only have a 50 KB/s IPsec
link between them, so data transfer needs to be kept to a minimum.
thanks for any suggestions,
gene