On 13/04/15 04:44, Andrew Beekhof wrote:
>
>> On 31 Mar 2015, at 12:25 am, Christine Caulfield <ccaulfie@xxxxxxxxxx> wrote:
>>
>> This is an updated document based on the responses I've had. Thank you
>> everyone.
>> Which is client/server? (if satellites are client with authkey we
>> get easy failover and config, but ... DOS potential??)
>
> satellites have to be the server.
> otherwise security and failure are a nightmare.

Agreed

>>
>> How to 'fake' satellite node IDs in the CPG nodes list - will
>> probably need to extend the libcpg API.
>>
>> do we need to add 'fake' join/leave events too?
>>
>> What if tcp buffers get full? Suggest just cutting off the node.
>>
>> Fencing, do we need it? (pacemaker problem?)
>>
>> Keeping two node lists (totem/quorum and satellite) - duplicate node
>> IDs are not allowed and this will need to be enforced.
>>
>> No real idea if this will scale as well as I hope it will!
>>
>> GFS2 et al? is this needed/possible?
>
> I’d not go there :)
>

Wise advice :)

>>
>> How (if at all) does knet fit into all this?
>>
>> How it will (possibly) work
>> ---------------------------
>> Have a separate daemon that runs on a corosync parent node and
>> communicates between the local corosync & its satellites
>> IDEA: Can we use the 'real' corosync libs and have a different server
>> back end on the satellites?
>> - reuse the corosync server-side IPC code
>>
>> CPG - would just be forwarded on to the parent with node ID 'fixed'
>> cmap - forwarded to parent corosync
>> quorum - keep own context
>> CFG - shutdown request as corosync cfg
>>
>> Need some API (or cmap?) for satellite node list
>
> If you have the daemon make one connection per satellite (maybe by
> spawning a child for each one) they’d automatically show up in the
> CPG list.
>

They will get a unique CPG connection yes, but they will share a nodeid,
The pid/nodeid pair will be unique though - would that be sufficient?
If that's the case then we probably don't even need to allocate extra
nodeids for satellites, just keep a list of nodeid/pid pairs. Which
makes things a lot easier!

>> Use a separate CPG for managing the satellites node list etc
>>
>> Does the satellite pacemaker/others need to know it is running on a
>> satellite?
>> - We can add a cmap key to hold this info.
>>
>> - joining
>> It's best(*) for the parents to boot the satellites
>> (*more secure, less DoS possibilities, more control)
>> - do we poll for dead satellites? how often? how? (connect?, ping?)
>> - CPG group to determine who is the parent of a satellite when a
>> parent leaves
>> - allows easy failover & maintenance of node list
>>
>> - leaving
>> If a TCP send fails or a socket is disconnected then the node is
>> summarily removed
>> - there will probably also be a 'leave' message sent by the parent
>> for tidy removal
>> - leave notifications are sent around the cluster so that the
>> secondary nodelist knows.
>> - quorum does not need to know.
>> - if a parent leaves then we need to send satellite node down
>> messages too (in the new service/private CPG) not for quorum,
>> but for cpg clients.
>>
>> - failover
>> When a parent fails or leaves, another suitable parent should contact
>> the orphaned satellites and try to include them back in the cluster.
>> Some form of network topology might be nice here so the nearest parent
>> contacts the satellite.
>> - also load balancing?
>>
>> Timescales
>> ----------
>> Nothing decided at this stage, probably Corosync 3.0 at the earliest.
>> Need to do a proof-of-concept, maybe using containers to get high node
>> count.
>>
>> Corosync services used by pacemaker (please check!)
>> ---------------------------------------------------
>> CPG - obviously
>> CFG - used to prevent corosync shutdown if pacemaker is running
>> cmap - Need to client-server this on a per-request basis
>> used for nodelist and logging options AFAICT
>> so mainly called at startup
>> quorum - including notification
>
> looks right. These are the headers I see us using:
>
> <corosync/cfg.h>
> <corosync/cmap.h>
> <corosync/confdb.h>
> <corosync/corodefs.h>
> <corosync/corotypes.h>
> <corosync/cpg.h>
> <corosync/engine/config.h>
> <corosync/engine/objdb.h>
> <corosync/hdb.h>
> <corosync/quorum.h>
> <corosync/totem/totempg.h>

Why are you using headers in engine/ and totem/ ? Those worry me,
they're internal. I hope it's just because of some deficiency in the
corosync headers below that.

Chrissie
_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss
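
For illustration only: the "list of nodeid/pid pairs" idea from the thread
above could be sketched roughly as below. This is a minimal sketch, assuming
satellites are identified purely by a (nodeid, pid) pair and that duplicates
must be rejected. Every name here (sat_list, sat_add, sat_remove, SAT_MAX) is
hypothetical and not part of any corosync API, and the fixed-size array is
only there to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; none of this is corosync API. */
#define SAT_MAX 64 /* arbitrary illustrative limit */

struct sat_entry {
    uint32_t nodeid; /* nodeid shared with the parent's corosync */
    uint32_t pid;    /* pid of the per-satellite connection */
};

struct sat_list {
    struct sat_entry entries[SAT_MAX];
    size_t count;
};

void sat_list_init(struct sat_list *list)
{
    assert(list != NULL);
    list->count = 0;
}

/* Return the index of the (nodeid, pid) pair, or -1 if it is not present. */
int sat_find(const struct sat_list *list, uint32_t nodeid, uint32_t pid)
{
    assert(list != NULL);
    for (size_t i = 0; i < list->count; i++) {
        if (list->entries[i].nodeid == nodeid &&
            list->entries[i].pid == pid)
            return (int)i;
    }
    return -1;
}

/* Add a pair; fails if the list is full or the pair already exists. */
bool sat_add(struct sat_list *list, uint32_t nodeid, uint32_t pid)
{
    if (list->count == SAT_MAX || sat_find(list, nodeid, pid) >= 0)
        return false;
    list->entries[list->count].nodeid = nodeid;
    list->entries[list->count].pid = pid;
    list->count++;
    return true;
}

/* Remove a pair by overwriting it with the last entry; order is not kept. */
bool sat_remove(struct sat_list *list, uint32_t nodeid, uint32_t pid)
{
    int i = sat_find(list, nodeid, pid);
    if (i < 0)
        return false;
    list->entries[i] = list->entries[list->count - 1];
    list->count--;
    return true;
}
```

Two satellites behind the same parent would then appear as, say, (1, 100)
and (1, 101): the same nodeid, told apart only by pid.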