Re: RFC: Extending corosync to high node counts

On 13/04/15 04:44, Andrew Beekhof wrote:
> 
>> On 31 Mar 2015, at 12:25 am, Christine Caulfield <ccaulfie@xxxxxxxxxx> wrote:
>>
>> This is an updated document based on the responses I've had. Thank you
>> everyone.

>>    Which is client/server? (if satellites are client with authkey we
>> get easy failover and config, but ... DOS potential??)
> 
> satellites have to be the server.
> otherwise security and failure are a nightmare.


Agreed

>>
>>    How to 'fake' satellite node IDs in the CPG nodes list - will
>> probably need to extend the libcpg API.
>>
>>    do we need to add 'fake' join/leave events too?
>>
>>    What if tcp buffers get full? Suggest just cutting off the node.
>>
>>    Fencing, do we need it? (pacemaker problem?)
>>
>>    Keeping two node lists (totem/quorum and satellite) - duplicate node
>> IDs are not allowed and this will need to be enforced.
>>
>>    No real idea if this will scale as well as I hope it will!
>>
>>    GFS2 et al? is this needed/possible?
> 
> I’d not go there :)
> 

Wise advice :)

>>
>>    How (if at all) does knet fit into all this?
>>
>> How it will (possibly) work
>> ---------------------------
>> Have a separate daemon that runs on a corosync parent node and
>> communicates between the local corosync & its satellites
>> IDEA: Can we use the 'real' corosync libs and have a different server
>> back end on the satellites?
>> - reuse the corosync server-side IPC code
>>
>> CPG - would just be forwarded on to the parent with node ID 'fixed'
>> cmap - forwarded to parent corosync
>> quorum - keep own context
>> CFG - shutdown request as corosync cfg
>>
>> Need some API (or cmap?) for satellite node list
> 
> If you have the daemon make one connection per satellite (maybe by spawning a child for each one) they’d automatically show up in the CPG list.
> 

They will get a unique CPG connection, yes, but they will share a nodeid.
The pid/nodeid pair will be unique though - would that be sufficient?

If that's the case then we probably don't even need to allocate extra
nodeids for satellites, just keep a list of nodeid/pid pairs. Which
makes things a lot easier!
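
To make the nodeid/pid idea concrete, here is a rough sketch of what such a
list might look like. It is only an illustration, not corosync code: the
names (Member, SatelliteList, join, leave, parent_down) are all made up, and
it assumes each satellite is represented by one process on its parent, so
that the (parent nodeid, pid) pair is unique cluster-wide.

```python
# Hypothetical sketch: none of these names exist in corosync or libcpg.
from typing import List, NamedTuple, Set


class Member(NamedTuple):
    nodeid: int  # nodeid of the parent corosync node (shared by its satellites)
    pid: int     # pid of the per-satellite connection process on that parent


class SatelliteList:
    """Secondary node list keyed on (nodeid, pid) pairs."""

    def __init__(self) -> None:
        self._members: Set[Member] = set()

    def join(self, nodeid: int, pid: int) -> bool:
        """Add a satellite; return False if the pair is already present."""
        member = Member(nodeid, pid)
        if member in self._members:
            return False
        self._members.add(member)
        return True

    def leave(self, nodeid: int, pid: int) -> bool:
        """Remove a satellite; return False if it was not in the list."""
        member = Member(nodeid, pid)
        if member not in self._members:
            return False
        self._members.remove(member)
        return True

    def parent_down(self, nodeid: int) -> List[Member]:
        """Drop every satellite attached to a departed parent and return them."""
        gone = sorted(m for m in self._members if m.nodeid == nodeid)
        self._members.difference_update(gone)
        return gone
```

The same pid can appear under two different parents without clashing, and
parent_down() gives the set of satellites that would need 'node down'
messages when a parent leaves.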

>> Use a separate CPG for managing the satellites node list etc
>>
>> Does the satellite pacemaker/others need to know it is running on a
>> satellite?
>> - We can add a cmap key to hold this info.
>>
>> - joining
>>   It's best(*) for the parents to boot the satellites
>>     (*more secure, less DoS possibilities, more control)
>>     - do we poll for dead satellites? how often? how?(connect?, ping?)
>>     - CPG group to determine who is the parent of a satellite when a
>> parent leaves
>>        - allows easy failover & maintenance of node list
>>
>> - leaving
>>   If a TCP send fails or a socket is disconnected then the node is
>> summarily removed
>>   - there will probably also be a 'leave' message sent by the parent
>> for tidy removal
>>   - leave notifications are sent around the cluster so that the
>> secondary nodelist knows.
>>   - quorum does not need to know.
>>   - if a parent leaves then we need to send satellite node down
>> messages too (in the
>>     new service/private CPG) not for quorum, but for cpg clients.
>>
>> - failover
>>   When a parent fails or leaves, another suitable parent should contact
>> the orphaned satellites and try to include them back in the cluster.
>> Some form of network topology might be nice here so the nearest parent
>> contacts the satellite.
>>   - also load balancing?
>>
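
One possible shape for the failover step, purely as a sketch. It assumes some
distance metric between a parent and a satellite is available (the distance
callable below is hypothetical; nothing in the proposal defines one) and it
breaks ties by picking the parent with the fewest satellites, which is one
way to get crude load balancing. All names are illustrative.

```python
# Hypothetical sketch: reassign_orphans and its arguments are made-up names.
from typing import Callable, Dict, List


def reassign_orphans(
    orphans: List[str],
    parents: Dict[int, List[str]],
    distance: Callable[[int, str], int],
) -> Dict[str, int]:
    """Assign each orphaned satellite to a surviving parent.

    parents maps a parent nodeid to the satellites it already serves and is
    updated in place. Preference order: smallest distance, then lightest
    load, then lowest nodeid (so every node computes the same answer).
    """
    if not parents:
        raise ValueError("no surviving parents to take over the satellites")
    assignment: Dict[str, int] = {}
    for sat in orphans:
        best = min(
            parents,
            key=lambda p: (distance(p, sat), len(parents[p]), p),
        )
        parents[best].append(sat)
        assignment[sat] = best
    return assignment
```

Because the choice is deterministic given the same inputs, every parent could
in principle compute the assignment locally and only the chosen one would
contact the satellite. Whether that is good enough in practice is an open
question.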
>> Timescales
>> ----------
>> Nothing decided at this stage, probably Corosync 3.0 at the earliest.
>> Need to do a proof-of-concept, maybe using containers to get high node
>> count.
>>
>> Corosync services used by pacemaker (please check!)
>> ---------------------------------------------------
>> CPG  - obviously
>> CFG  - used to prevent corosync shutdown if pacemaker is running
>> cmap - Need to client-server this on a per-request basis
>>           used for nodelist and logging options AFAICT
>>           so mainly called at startup
>> quorum - including notification
> 
> looks right.  These are the headers I see us using:
> 
>  <corosync/cfg.h>
>  <corosync/cmap.h>
>  <corosync/confdb.h>
>  <corosync/corodefs.h>
>  <corosync/corotypes.h>
>  <corosync/cpg.h>
>  <corosync/engine/config.h>
>  <corosync/engine/objdb.h>
>  <corosync/hdb.h>
>  <corosync/quorum.h>
>  <corosync/totem/totempg.h>

Why are you using headers in engine/ and totem/ ? Those worry me,
they're internal. I hope it's just because of some deficiency in the
corosync headers below that.
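
For what it's worth, a quick way to spot these in a list of includes. This is
just a throwaway sketch: the prefix list is my assumption based on the two
directories named above, not an official definition of what is internal, and
internal_headers is a made-up helper name.

```python
# Hypothetical helper: flags includes under the engine/ and totem/ directories.
from typing import Iterable, List

# Assumption: only these two directories are treated as internal here.
INTERNAL_PREFIXES = ("corosync/engine/", "corosync/totem/")


def internal_headers(headers: Iterable[str]) -> List[str]:
    """Return the headers that live under an internal directory."""
    return [h for h in headers if h.startswith(INTERNAL_PREFIXES)]


pacemaker_includes = [
    "corosync/cfg.h",
    "corosync/cmap.h",
    "corosync/confdb.h",
    "corosync/corodefs.h",
    "corosync/corotypes.h",
    "corosync/cpg.h",
    "corosync/engine/config.h",
    "corosync/engine/objdb.h",
    "corosync/hdb.h",
    "corosync/quorum.h",
    "corosync/totem/totempg.h",
]

flagged = internal_headers(pacemaker_includes)
```

With the list quoted above, flagged ends up holding the two engine/ headers
and totem/totempg.h.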

Chrissie


_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss




