On 23/03/15 03:11, Andrew Beekhof wrote:
>
>> On 19 Mar 2015, at 9:05 pm, Christine Caulfield <ccaulfie@xxxxxxxxxx> wrote:
>>
>> Extending corosync
>> ------------------
>>
>> This is an idea that came out of several discussions at the cluster
>> summit in February. Please comment!
>>
>> It is not meant to be a generalised solution to extending corosync for
>> most users. For single- and double-digit cluster sizes the current ring
>> protocols should be sufficient. This is intended to make corosync
>> usable over much larger node counts.
>>
>> The problem
>> -----------
>> Corosync doesn't scale well to large numbers of nodes (60-100 to
>> 1000s). This is mainly down to the requirements of virtual synchrony
>> (VS) and the ring protocol.
>>
>> A proposed solution
>> -------------------
>> Have 'satellite' nodes that are not part of the ring (and do not
>> participate in VS). They communicate with a single 'host' node over
>> (possibly) TCP. The host sends messages to them in a 'send and forget'
>> system - though TCP guarantees ordering and delivery. Host nodes can
>> support many satellites. If a host goes down, its satellites can
>> reconnect to another node and carry on.
>>
>> Satellites have no votes, and do not participate in virtual synchrony.
>>
>> Satellites can send/receive CPG messages and get quorum information,
>> but will not appear in the quorum nodes list.
>>
>> There must be a separate nodes list for satellites, probably maintained
>> by a different subsystem. Satellites will have nodeids (required for
>> CPG) that do not clash with the ring nodeids.
>>
>>
>> Appearance to the user/admin
>> ----------------------------
>> corosync.conf defines which nodes are satellites and which nodes they
>> connect to (initially). We may want some utility to force satellites to
>> migrate off a node if it gets full.
>>
>> Future: automatic configuration of who is in the VS cluster and who is
>> a satellite. Load balancing. We may need 'preferred nodes' to avoid bad
>> network topologies.
>>
>>
>> Potential problems
>> ------------------
>> corosync uses a packet-based protocol; TCP is a stream (I don't see
>> this as a big problem, TBH).
>> Where to hook the message transmission in the corosync networking
>> stack?
>>  - We don't need a lot of the totem messages
>>  - maybe hook into group 'a' and/or 'sync' (do we need 'sync' in
>>    satellites? [CPG, so probably yes])
>> Which is client and which is server? (If satellites are clients with an
>> authkey we get easy failover and configuration, but... DoS potential??)
>> What if the TCP buffers get full? Suggest just cutting off the node.
>> How to stop satellites from running totemsrp?
>> Fencing: do we need it? (A pacemaker problem?)
>
> That has traditionally been the model and it still seems appropriate.
> However Darren raises an interesting point... how will satellites know
> which is the "correct" partition to connect to?
>
> What would it look like if we flipped it around and had the full peers
> connecting to the satellites? You could then tie that to having quorum.
> You also know that a fenced full peer won't have any connections.
> Safety on two levels.

I think this is a better, if slightly more complex, model to implement,
yes. It also avoids the potential DoS of satellites repeatedly trying to
contact the central cluster nodes.

Chrissie
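
For concreteness, a corosync.conf fragment for the satellite model might
look something like the sketch below. The 'satellite' and
'satellite_hosts' options are invented here purely for illustration - no
such options exist in corosync today, and the proposal above does not fix
any syntax:

    nodelist {
        node {
            ring0_addr: full1
            nodeid: 1
        }
        node {
            ring0_addr: full2
            nodeid: 2
        }
        # A satellite: no vote, no ring membership. Its nodeid is drawn
        # from a separate range so it cannot clash with ring nodeids.
        node {
            ring0_addr: sat1
            nodeid: 1001
            satellite: yes                  # hypothetical option
            satellite_hosts: full1, full2   # hypothetical: initial hosts
        }
    }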
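
On the packet-vs-stream point under "Potential problems": the usual
answer is to frame each packet with a length prefix on the TCP stream,
which is probably why it isn't a big problem. A minimal sketch in C of
the receive side follows - the function names and the 4-byte header are
assumptions for illustration, not any corosync API:

    #include <stdint.h>
    #include <unistd.h>
    #include <arpa/inet.h>

    /* Read exactly 'len' bytes, looping over short reads. */
    static int read_all(int fd, void *buf, size_t len)
    {
        char *p = buf;
        while (len > 0) {
            ssize_t n = read(fd, p, len);
            if (n <= 0)
                return -1;  /* error or peer closed the connection */
            p += n;
            len -= (size_t)n;
        }
        return 0;
    }

    /* Receive one framed packet: 4-byte network-order length prefix,
     * then the packet payload itself. */
    static ssize_t recv_packet(int fd, void *buf, size_t bufsize)
    {
        uint32_t netlen;
        if (read_all(fd, &netlen, sizeof(netlen)) < 0)
            return -1;
        uint32_t len = ntohl(netlen);
        if (len > bufsize)
            return -1;      /* oversized frame: protocol error */
        if (read_all(fd, buf, len) < 0)
            return -1;
        return (ssize_t)len;
    }

A failed read or an oversized frame would simply drop the connection,
which matches the "just cut off the node" suggestion for full TCP
buffers above.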