On Wed, Oct 23, 2019 at 9:56 AM Sage Weil <sweil@xxxxxxxxxx> wrote:
>
> I'm trying to implement MDS daemon management for mgr/ssh and am
> confused by the intent of the orchestrator interface.
>
> - The add_mds() method takes a 'spec' StatelessServiceSpec that has
>   a ctor like
>
>     def __init__(self, name, placement=None, count=None):
>
>   but it is constructed only with a name:
>
>     @_write_cli('orchestrator mds add',
>                 "name=svc_arg,type=CephString",
>                 'Create an MDS service')
>     def _mds_add(self, svc_arg):
>         spec = orchestrator.StatelessServiceSpec(svc_arg)
>
>   That means count=1 and placement is unspecified. That's fine for Rook,
>   sort of, as long as you want exactly 1 MDS for each file system.
>
> - Given that, can we rename the 'svc_arg' arg to 'name'?
>
> - The 'name' here, IIUC, is the name of the grouping of daemons. I think
>   it was intended to be a file system, as per the docs:
>
>     The ``name`` parameter is an identifier of the group of instances:
>
>     * a CephFS file system for a group of MDS daemons,
>     * a zone name for a group of RGWs
>
>   but IIRC the new CephFS behavior is that all standby daemons go into
>   the same pool and are doled out to file systems that need them
>   arbitrarily. In that case, I think the only thing we would want to
>   specify (in the Rook case, where we don't pick daemon location) is the
>   count of MDSs... and then have a single name grouping. Is that right
>   for CephFS?

Yes. One issue we need to consider is that when we have the mgr
creating/deleting MDS daemons based on the needs of the file systems, we
will need to delete a specific standby and not just any daemon.
Otherwise, we cause unnecessary failovers. Perhaps the MDS name should
just be a random short string of letters and not identify a "group" of
MDS daemons.

> I have a feeling it won't work for the other daemon types, though, like
> NFS servers, which *do* care what they are serving up.
>
> - For SSH, none of that works, since we need to pass a location when
>   adding daemons. It seems like we want something closer to nfs_add,
>   which is
>
>     @_write_cli('orchestrator nfs add',
>                 "name=svc_arg,type=CephString "
>                 "name=pool,type=CephString "
>                 "name=namespace,type=CephString,req=false",
>                 'Create an NFS service')
>
>   i.e.,
>
>   * 'add' takes a 'name' (the actual daemon name) and a location (if the
>     orch needs it).
>   * 'rm' takes the same name and removes it.
>   * 'update' does the smarts of adding ($want - $have) daemons for a
>     given group and generating names for them. Something else organizes
>     these into groups (a common name prefix?). I.e., 'update' basically
>     builds on 'add' and 'rm'.
>
> And/or, we introduce some basic scheduling into the ssh orchestrator (or
> orchestrator_cli). I'm not sure this actually needs to be that smart,
> since we can probably get away with something quite simple: round-robin
> assignment of daemons to hosts, and the ability to label nodes for a
> daemon type or daemon type + grouping. This would basically give the ssh
> orch what ansible does as far as mapping out the deployment, and
> gracefully degrade to something that "just works" (well enough) when you
> don't know/care where things land. Obviously having a real scheduler
> like the one in k8s do this is better, but for non-kube deployments,
> there is still a need for placing daemons on hosts in a way that is easy
> for the human operator.

Agreed.
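To make the "'update' builds on 'add' and 'rm'" idea concrete, here is a
rough sketch of what I have in mind. All of the names here (get_daemons,
add_daemon, remove_daemon, is_standby) are hypothetical placeholders, not
the actual orchestrator interface:

    import random
    import string

    def _rand_suffix(n=6):
        # random short string of letters, as suggested above for MDS names
        return ''.join(random.choices(string.ascii_lowercase, k=n))

    def update(svc_type, group, want,
               get_daemons, add_daemon, remove_daemon):
        # daemons in a group share a common name prefix, e.g. "myfs."
        have = [d for d in get_daemons(svc_type)
                if d.name.startswith(group + '.')]
        if want > len(have):
            for _ in range(want - len(have)):
                add_daemon(svc_type,
                           name='%s.%s' % (group, _rand_suffix()))
        else:
            # sort actives first so that standbys fall past the 'want'
            # cutoff and are removed preferentially, avoiding the
            # unnecessary failovers mentioned above
            for d in sorted(have, key=lambda d: d.is_standby)[want:]:
                remove_daemon(svc_type, d.name)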
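And for the simple placement you describe, something like the following
round-robin over labeled hosts may be all we need. Again, only a sketch;
the label scheme (a plain "mds" label, or "mds.myfs" for type + grouping)
is an assumption, not anything that exists today:

    import itertools

    def place(hosts, labels, svc_type, group, count):
        # restrict to hosts labeled for this daemon type, or for this
        # daemon type + grouping, if any such labels exist
        candidates = [h for h in hosts
                      if svc_type in labels.get(h, set())
                      or '%s.%s' % (svc_type, group) in labels.get(h, set())]
        if not candidates:
            candidates = hosts  # no labels: fall back to all hosts
        # round-robin the daemons across the candidate hosts
        rr = itertools.cycle(candidates)
        return [next(rr) for _ in range(count)]

e.g. place(['a', 'b', 'c'], {}, 'mds', 'myfs', 5) would yield
['a', 'b', 'c', 'a', 'b'], which degrades to "just works" placement when
the operator hasn't labeled anything.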
-- 
Patrick Donnelly, Ph.D.
He / Him / His
Senior Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D