On 4/29/15 1:38 AM, Craig Ringer wrote:
>> Perhaps... different replication systems probably use different methods to identify, so presumably there'd need to be some way to map a generic identifier into an appropriate identifier for whatever replication system you're using.
> Replication identifiers do just that: provide a way to map identifiers from some external system into a local unique identifier for a peer node, along with tracking of the replay position from the peer so replay can be restarted at a consistent point. The replay position is an LSN, so they're not going to work for any arbitrary system, though.
Which may not work for something meant to work with different replication systems...
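To make the mapping idea concrete, here's a rough Python sketch of what's being described: an external peer name maps to a small local id, and a replay position is tracked per peer. This is only a toy model, not how replication identifiers are actually implemented; PeerRegistry, PeerState and parse_lsn are names I made up for illustration. The only PostgreSQL-specific part is the 'X/Y' hex text form of an LSN.

```python
from dataclasses import dataclass


@dataclass
class PeerState:
    local_id: int        # small local identifier for the peer (hypothetical)
    replay_lsn: int = 0  # last replayed position, stored as an integer LSN


class PeerRegistry:
    """Toy model: map external peer names to local ids and track replay LSNs."""

    def __init__(self):
        self._peers = {}
        self._next_id = 1

    def register(self, external_name):
        # Reuse the existing local id if this peer is already known.
        if external_name not in self._peers:
            self._peers[external_name] = PeerState(self._next_id)
            self._next_id += 1
        return self._peers[external_name].local_id

    def advance(self, external_name, lsn):
        # Replay positions only move forward.
        state = self._peers[external_name]
        state.replay_lsn = max(state.replay_lsn, lsn)

    def restart_point(self, external_name):
        # Position from which replay for this peer can be restarted.
        return self._peers[external_name].replay_lsn


def parse_lsn(text):
    """Convert the 'X/Y' hex text form of an LSN to an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)
```

Because the tracked position here is an LSN, a replication system whose position is something else would need its own position type, which is the limitation in question.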
>> You'd want a way to define different sets and associate them with nodes. A node could be a provider, subscriber, or both. I think some replication systems support 'pass through' as well, where the node passes data downstream but doesn't apply it itself. Or it could be multi-master and possibly a provider to read-only subscribers.
> Yeah, you're talking about some kind of abstract modelling of a replication topology. I'm not sure that's at all necessary to keep track of which tables should be replicated to which nodes.
I'd think that you'd still need to know if a table is a provider or subscriber regardless of topology; how else will you know how to add it?
As for the topology part, yes, perhaps that's more than the baseline case. Just dealing with tables and sets might be enough of a win that the topology isn't worth worrying about.
I originally had this idea when dealing with a number of londiste clusters and wishing I had something better than "Run this SELECT and paste the output to the command line" to deal with adding newly created tables. It seemed likely that a more generic system would also be pretty easy to plug into different replication systems; there'd just need to be a separate layer that translated the definitions into actual replication commands. Then the only thing missing would be defining what sets lived where; that would allow the generic system to define almost every aspect of a replication environment. Maybe that's too ambitious; the first step would be to track just which tables are in which set and see how that goes.
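Roughly what I have in mind for that first step, as a Python sketch: a generic catalog of which tables are in which set, plus a pluggable function that turns a (set, table) pair into a command for one particular replication system. Everything here is hypothetical (SetCatalog, plan_commands, render_example), and the command text is a made-up placeholder, not the syntax of londiste or any other real tool.

```python
from collections import defaultdict


class SetCatalog:
    """Generic record of which tables belong to which replication set."""

    def __init__(self):
        self._sets = defaultdict(set)

    def add_table(self, set_name, table):
        self._sets[set_name].add(table)

    def tables(self, set_name):
        # Sorted so the generated commands come out in a stable order.
        return sorted(self._sets[set_name])


def plan_commands(catalog, set_name, already_replicated, render):
    """Return commands for tables in the set that are not yet replicated.

    `render` is the per-system translation layer: it takes (set_name, table)
    and returns whatever command that replication system needs.
    """
    missing = [t for t in catalog.tables(set_name) if t not in already_replicated]
    return [render(set_name, t) for t in missing]


def render_example(set_name, table):
    # Placeholder output format, NOT the syntax of any real tool.
    return f"add-table --set={set_name} {table}"
```

Supporting a different replication system would then mean writing another render function, while the catalog stays the same.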
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com