On 25/03/15 01:24, Steven Dake wrote:
> I think if you don't care about performance, you can have a daemon
> process (second process) connect as a CPG service and maintain an
> overlay network on top of CPG. Then many other external endpoints
> could connect to this server over TCP.

That's an interesting idea that I quite like. And it might be nice and
easy to get a proof of concept up and running. It would probably
require a different API to the normal corosync one (I'm not sure that
emulating libcpg etc. for a different daemon would be sensible).
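To give an idea of the shape such a proof of concept might take, here
is a very rough sketch of a relay daemon that joins a CPG group and
fans delivered messages out to TCP clients. The group name "relay" and
the relay_to_tcp_clients()/notify_tcp_clients() helpers are
placeholders, not a proposed API, and the whole TCP side is left out:

/* Rough sketch only: join a CPG group and relay delivered messages to
 * TCP clients.  Build roughly with:  gcc relay.c -o relay -lcpg
 */
#include <stdio.h>
#include <string.h>
#include <corosync/cpg.h>

static void relay_to_tcp_clients(const void *msg, size_t len)
{
    /* Placeholder: write a length-prefixed copy of msg to every
     * connected TCP client socket. */
}

static void notify_tcp_clients(size_t n_members)
{
    /* Placeholder: tell TCP clients about membership changes. */
}

static void deliver_cb(cpg_handle_t handle, const struct cpg_name *group,
                       uint32_t nodeid, uint32_t pid,
                       void *msg, size_t msg_len)
{
    relay_to_tcp_clients(msg, msg_len);
}

static void confchg_cb(cpg_handle_t handle, const struct cpg_name *group,
                       const struct cpg_address *members, size_t n_members,
                       const struct cpg_address *left, size_t n_left,
                       const struct cpg_address *joined, size_t n_joined)
{
    notify_tcp_clients(n_members);
}

static cpg_callbacks_t callbacks = {
    .cpg_deliver_fn = deliver_cb,
    .cpg_confchg_fn = confchg_cb,
};

int main(void)
{
    cpg_handle_t handle;
    struct cpg_name group;

    strcpy(group.value, "relay");      /* placeholder group name */
    group.length = strlen(group.value);

    if (cpg_initialize(&handle, &callbacks) != CS_OK ||
        cpg_join(handle, &group) != CS_OK) {
        fprintf(stderr, "cannot join CPG group\n");
        return 1;
    }

    /* A real daemon would poll the fd from cpg_fd_get() alongside its
     * TCP listening socket; blocking here keeps the sketch short. */
    cpg_dispatch(handle, CS_DISPATCH_BLOCKING);

    cpg_finalize(handle);
    return 0;
}

Messages coming the other way (from TCP clients) would presumably be
re-multicast into the group with cpg_mcast_joined().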
How does that sound to the Pacemaker team?

Chrissie

> The problem with totem re scaling isn't virtual synchrony btw, it is
> the membership protocol, which creates a fully meshed network.
> Membership protocols that maintain a mesh membership are expensive to
> set up but cheap to maintain (regarding the network protocol activity).
>
> I'm happy to see people are thinking about how to make corosync scale
> past the historical ~30 node limit that seems to apply in practice.
>
> Regards
> -steve
>
> On Mon, Mar 23, 2015 at 2:09 AM, Christine Caulfield
> <ccaulfie@xxxxxxxxxx> wrote:
>
> On 23/03/15 03:11, Andrew Beekhof wrote:
>
> >> On 19 Mar 2015, at 9:05 pm, Christine Caulfield
> >> <ccaulfie@xxxxxxxxxx> wrote:
> >>
> >> Extending corosync
> >> ------------------
> >>
> >> This is an idea that came out of several discussions at the cluster
> >> summit in February. Please comment!
> >>
> >> It is not meant to be a generalised solution to extending corosync
> >> for most users. For single- and double-digit cluster sizes the
> >> current ring protocols should be sufficient. This is intended to
> >> make corosync usable over much larger node counts.
> >>
> >> The problem
> >> -----------
> >> Corosync doesn't scale well to large numbers of nodes (60-100 to
> >> 1000s). This is mainly down to the requirements of virtual
> >> synchrony (VS) and the ring protocol.
> >>
> >> A proposed solution
> >> -------------------
> >> Have 'satellite' nodes that are not part of the ring (and do not
> >> participate in VS). They communicate via a single 'host' node over
> >> (possibly) TCP. The host sends messages to them in a 'send and
> >> forget' fashion - though TCP guarantees ordering and delivery.
> >> Host nodes can support many satellites. If a host goes down, the
> >> satellites can reconnect to another node and carry on.
> >>
> >> Satellites have no votes and do not participate in virtual synchrony.
> >>
> >> Satellites can send/receive CPG messages and get quorum information
> >> but will not appear in the quorum nodes list.
> >>
> >> There must be a separate nodes list for satellites, probably
> >> maintained by a different subsystem. Satellites will have nodeids
> >> (required for CPG) that do not clash with the ring nodeids.
> >>
> >>
> >> Appearance to the user/admin
> >> ----------------------------
> >> corosync.conf defines which nodes are satellites and which nodes to
> >> connect to (initially). May want some utility to force satellites
> >> to migrate from a node if it gets full.
> >>
> >> Future: Automatic configuration of who is in the VS cluster and who
> >> is a satellite. Load balancing. Maybe need 'preferred nodes' to
> >> avoid bad network topologies.
> >>
> >>
> >> Potential problems
> >> ------------------
> >> corosync uses a packet-based protocol, whereas TCP is a stream
> >> (I don't see this as a big problem, TBH).
> >> Where to hook the message transmission in the corosync networking
> >> stack?
> >>  - We don't need a lot of the totem messages
> >>  - maybe hook into group 'a' and/or 'sync' (do we need 'sync' in
> >>    satellites [CPG, so probably yes]?)
> >> Which is client/server? (If satellites are the client with the
> >> authkey we get easy failover and config, but ... DoS potential?)
> >> What if TCP buffers get full? Suggest just cutting off the node.
> >> How to stop satellites from running totemsrp?
> >> Fencing, do we need it? (Pacemaker problem?)
> >
> > That has traditionally been the model and it still seems appropriate.
> > However, Darren raises an interesting point... how will satellites
> > know which is the "correct" partition to connect to?
> >
> > What would it look like if we flipped it around and had the full
> > peers connecting to the satellites?
> > You could then tie that to having quorum. You also know that a
> > fenced full peer won't have any connections.
> > Safety on two levels.
>
> I think this is a better, if slightly more complex, model to
> implement, yes. It also avoids the potential DoS of satellites trying
> to contact central cluster nodes repeatedly.
>
> Chrissie
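For the flipped model quoted above, the full-peer side could gate its
outgoing satellite connections on quorum. A minimal sketch, assuming
the corosync 2.x libquorum API and hypothetical
connect_satellites()/drop_satellites() helpers for whatever transport
is eventually chosen (a real implementation would use
quorum_trackstart() notifications rather than polling):

/* Sketch: a full ring member only holds connections to its satellites
 * while its partition is quorate.  Build roughly with:
 *   gcc peer-gate.c -o peer-gate -lquorum
 */
#include <stdio.h>
#include <unistd.h>
#include <corosync/quorum.h>

static void connect_satellites(void)
{
    /* Placeholder: dial out (TCP) to the satellites assigned to this host. */
}

static void drop_satellites(void)
{
    /* Placeholder: close all satellite connections. */
}

int main(void)
{
    quorum_handle_t handle;
    quorum_callbacks_t callbacks = { .quorum_notify_fn = NULL };
    uint32_t quorum_type;
    int quorate, connected = 0;

    if (quorum_initialize(&handle, &callbacks, &quorum_type) != CS_OK) {
        fprintf(stderr, "cannot connect to the quorum service\n");
        return 1;
    }

    for (;;) {
        if (quorum_getquorate(handle, &quorate) != CS_OK)
            break;
        if (quorate && !connected) {
            connect_satellites();
            connected = 1;
        } else if (!quorate && connected) {
            drop_satellites();
            connected = 0;
        }
        sleep(1);   /* polling only to keep the sketch short */
    }

    quorum_finalize(handle);
    return 0;
}

That also gives the "safety on two levels" Andrew mentions: an
inquorate partition drops its satellites, and a fenced full peer is not
running at all, so it cannot hold any connections.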