orchestrator mds add|update

Sage Weil <sweil@xxxxxxxxxx> · Wed, 23 Oct 2019 16:56:06 +0000 (UTC)

I'm trying to implement MDS daemon management for mgr/ssh and am 
confused by the intent of the orchestrator interface.

- The add_mds() method takes a 'spec' StatelessServiceSpec that has 
a ctor like

    def __init__(self, name, placement=None, count=None):

but it is constructed only with a name:

    @_write_cli('orchestrator mds add',
                "name=svc_arg,type=CephString",
                'Create an MDS service')
    def _mds_add(self, svc_arg):
        spec = orchestrator.StatelessServiceSpec(svc_arg)

That means count=1 and placement is unspecified.  That's fine for Rook, 
sort of, as long as you want exactly 1 MDS for each file system.

- Given that, can we rename the 'svc_arg' arg to 'name'?

- The 'name' here, IIUC, is the name of the grouping of daemons.  I think 
it was intended to be a file system, as per the docs:

 The ``name`` parameter is an identifier of the group of instances:

 * a CephFS file system for a group of MDS daemons,
 * a zone name for a group of RGWs

but IIRC the new CephFS behavior is that all standby daemons go into the 
same pool and are doled out to file systems that need them arbitrarily.  
In that case, I think the only thing we would want to specify (in the rook 
case where we don't pick daemon location) is the count of MDSs... and 
then have a single name grouping.  Is that right for CephFS?  I have a 
feeling it won't work for the other daemon types, though, like NFS 
servers, which *do* care what they are serving up.

- For SSH, none of that works, since we need to pass a location when 
adding daemons.  It seems like we want something closer to nfs_add, 
which is

    @_write_cli('orchestrator nfs add',
                "name=svc_arg,type=CephString "
                "name=pool,type=CephString "
                "name=namespace,type=CephString,req=false",
                'Create an NFS service')

i.e.,

   * 'add' takes a 'name' (the actual daemon name) and a location (if the 
orch needs it).
   * 'rm' takes the same name and removes it.
   * 'update' does the smarts of adding ($want - $have) daemons for a 
given group and generating names for them.  Something else organizes these 
into groups (a common name prefix?).  I.e., 'update' basically builds on 
'add' and 'rm'.
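
A minimal sketch of how 'update' could be layered on 'add' and 'rm' as described above. Everything here is hypothetical and not the existing orchestrator API: the function name update_group, the add/rm callables, and the "<group>.<n>" naming scheme (one way to get the common name prefix). The location argument that 'add' would need for SSH is left out to keep the example short.

```python
from typing import Callable, List


def update_group(group: str,
                 want: int,
                 have: List[str],
                 add: Callable[[str], None],
                 rm: Callable[[str], None]) -> List[str]:
    """Bring a daemon group to `want` members using only add and rm.

    `have` lists the daemon names currently in the group. Returns the
    names in the group after reconciliation.
    """
    if want < 0:
        raise ValueError("want must be >= 0")

    current = sorted(have)

    # Too many daemons: remove from the end of the sorted list.
    while len(current) > want:
        rm(current.pop())

    # Too few daemons: generate unused names sharing the group prefix
    # (hypothetical "<group>.<n>" scheme) and add them.
    used = set(current)
    i = 0
    while len(current) < want:
        name = f"{group}.{i}"
        i += 1
        if name in used:
            continue
        add(name)
        used.add(name)
        current.append(name)

    return current


if __name__ == "__main__":
    added: List[str] = []
    removed: List[str] = []
    print(update_group("a", 4, ["a.0", "a.2"], added.append, removed.append))
    print("added:", added, "removed:", removed)
```

The point of the shape is that 'update' holds no placement or deployment logic of its own; it only computes the difference and delegates, so each orchestrator backend would only have to implement 'add' and 'rm'.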

And/or, we introduce some basic scheduling into ssh orchestrator (or 
orchestrator_cli).  I'm not sure this needs to be all that smart, since we 
can probably get away with something quite simple: round-robin assignment of 
daemons to hosts, and the ability to label nodes for a daemon type or 
daemon type + grouping.  This would basically give ssh orch what ansible 
does as far as mapping out the deployment, and gracefully degrade to 
something that "just works" (well enough) when you don't know/care 
where things land.  Obviously having a real scheduler like that in k8s 
do this is better, but for non-kube deployments, there is still a need for 
placing daemons on hosts to make things easy for the human operator.
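
A rough sketch of the simple scheduling idea above: round-robin assignment over hosts, narrowed by labels when they exist, and falling back to all hosts when nobody has labeled anything. The function name place_daemons, the host-to-labels dict, and the "<type>.<group>" label format are all assumptions made up for illustration; nothing like this exists in the orchestrator today.

```python
from itertools import cycle, islice
from typing import Dict, List, Optional, Set


def place_daemons(hosts: Dict[str, Set[str]],
                  daemon_type: str,
                  count: int,
                  group: Optional[str] = None) -> List[str]:
    """Pick a host for each of `count` daemons, round-robin.

    `hosts` maps host name -> set of labels on that host. A label is
    either a daemon type ("mds") or, hypothetically, type + grouping
    ("mds.fs1"). The most specific label with any match wins; with no
    match at all, every host is a candidate.
    """
    labels = [daemon_type]
    if group is not None:
        labels.insert(0, f"{daemon_type}.{group}")

    candidates: List[str] = []
    for label in labels:
        candidates = sorted(h for h, ls in hosts.items() if label in ls)
        if candidates:
            break

    # Graceful degradation: no labels set, so use any host.
    if not candidates:
        candidates = sorted(hosts)
    if not candidates:
        raise ValueError("no hosts available")

    # Round-robin: walk the candidate list cyclically, one host per daemon.
    return list(islice(cycle(candidates), count))


if __name__ == "__main__":
    hosts = {"a": {"mds"}, "b": {"mds"}, "c": set()}
    print(place_daemons(hosts, "mds", 3))
    print(place_daemons(hosts, "rgw", 4))
```

Sorting the candidates keeps the result deterministic for a given set of hosts and labels, which makes the outcome easier for an operator to predict.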

sage