Re: BDR Selective Replication

Jim Nasby <Jim.Nasby@xxxxxxxxxxxxxx> · Wed, 29 Apr 2015 17:40:44 -0500

On 4/29/15 1:38 AM, Craig Ringer wrote:
    Perhaps... different replication systems probably use different
    methods to identify, so presumably there'd need to be some way to
    map a generic identifier into an appropriate identifier for whatever
    replication system you're using.

Replication identifiers do just that: provide a way to map identifiers
from some external system into a local unique identifier for a peer
node, along with tracking of the replay position from the peer so replay
can be restarted at a consistent point. The replay position is an LSN,
so they're not going to work for any arbitrary system, though.

Which may not work for something meant to work with different 
replication systems...
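
To make that concern concrete, here's a rough sketch (Python, purely
illustrative) of the kind of mapping described above: an external peer
name maps to a local id, and each id carries a replay position. The
names PeerRegistry, register, advance and restart_point are made up for
this example and are not PostgreSQL APIs. The only PostgreSQL-specific
piece is the textual LSN format, two hex numbers separated by a slash.

```python
from dataclasses import dataclass, field


def parse_lsn(text: str) -> int:
    """Convert a PostgreSQL-style LSN such as '16/B374D848' to an integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


@dataclass
class PeerRegistry:
    """Hypothetical model: external peer name -> local id + replay position."""

    _ids: dict = field(default_factory=dict)
    _progress: dict = field(default_factory=dict)

    def register(self, external_name: str) -> int:
        # Reuse the existing local id if this peer is already known.
        if external_name not in self._ids:
            local_id = len(self._ids) + 1
            self._ids[external_name] = local_id
            self._progress[local_id] = 0
        return self._ids[external_name]

    def advance(self, local_id: int, lsn: str) -> None:
        # The replay position only ever moves forward.
        self._progress[local_id] = max(self._progress[local_id], parse_lsn(lsn))

    def restart_point(self, local_id: int) -> int:
        # Where replay from this peer would resume.
        return self._progress[local_id]
```

Note that the position stored here is an LSN; a replication system that
tracks its progress some other way would have nothing sensible to put
in that slot, which is the limitation being discussed.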

    You'd want a way to define different sets and associate them with
    nodes. A node could be a provider, subscriber, or both. I think some
    replication systems support 'pass through' as well, where the node
    passes data downstream but doesn't apply it itself. Or it could be
    multi-master and possibly a provider to read-only subscribers.

Yeah, you're talking about some kind of abstract modelling of a
replication topology. I'm not sure that's at all necessary to keep track
of which tables should be replicated to which nodes.

I'd think that you'd still need to know whether a given node is the
provider or a subscriber for a table regardless of topology; how else
will you know how to add it?

As for the topology part, yes, perhaps that's more than the baseline
case. Just dealing with tables and sets might be enough of a win that
it's not worth worrying about topology.

I originally had this idea when dealing with a number of londiste
clusters and wishing I had something better than "Run this SELECT and
paste the output to the command line" to deal with adding newly created
tables. It seemed likely that a more generic system could also be made
to plug into different replication systems fairly easily; there'd just
need to be a separate layer that translated the definitions into actual
replication commands. Then the only thing missing would be defining
what sets lived where; that would allow the generic system to define
almost every aspect of a replication environment. Maybe that's too
ambitious; the first step would be to track just which tables are in
which set and see how that goes.
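
A minimal sketch of that first step, in Python and with everything in
it hypothetical: the set definitions are just a mapping of set name to
table names, and a per-system "translator" function turns a (set,
table) pair into a command string. The command text produced by
system_a_command is a placeholder and not the syntax of londiste or any
other real tool.

```python
from typing import Callable, Dict, List, Set

# set name -> tables that should belong to that set
ReplicationSets = Dict[str, Set[str]]

# A translator turns one (set, table) pair into a system-specific command.
Translator = Callable[[str, str], str]


def system_a_command(set_name: str, table: str) -> str:
    # Placeholder output only; a real translator would emit whatever
    # the target replication system actually expects.
    return f"system-a add-table --set {set_name} {table}"


def pending_commands(
    defined: ReplicationSets,
    replicated: ReplicationSets,
    translate: Translator,
) -> List[str]:
    """Return commands for tables that are defined but not yet replicated."""
    commands = []
    for set_name in sorted(defined):
        missing = defined[set_name] - replicated.get(set_name, set())
        for table in sorted(missing):
            commands.append(translate(set_name, table))
    return commands
```

With a definition of {"orders": {"public.orders", "public.order_lines"}}
and only public.orders currently replicated, pending_commands would
yield a single command for public.order_lines. Supporting another
replication system would then only mean writing another translator.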
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
