Re: orchestrator mds add|update

On 23.10.19 at 18:56, Sage Weil wrote:
> I'm trying to implement MDS daemon management for mgr/ssh and am 
> confused by the intent of the orchestrator interface.
> 
> - The add_mds() method takes a 'spec' StatelessServiceSpec that has 
> a ctor like
> 
>     def __init__(self, name, placement=None, count=None):
> 
> but it is constructed only with a name:
> 
>     @_write_cli('orchestrator mds add',
>                 "name=svc_arg,type=CephString",
>                 'Create an MDS service')
>     def _mds_add(self, svc_arg):
>         spec = orchestrator.StatelessServiceSpec(svc_arg)
> 
> That means count=1 and placement is unspecified.  That's fine for Rook, 
> sort of, as long as you want exactly 1 MDS for each file system.

Yep. It turns out StatelessServiceSpec is insufficient for non-trivial
deployments and needs to be extended, as was done with NFSServiceSpec and
RGWSpec. At some point we might need an MDSSpec, too? Depends a bit on
what an MDS needs.
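
For illustration, a minimal sketch of what such an MDSSpec could look
like, modeled on how NFSServiceSpec and RGWSpec extend the base class
(hypothetical; nothing like this exists yet, and the MDS-specific fields
are made up):

    import orchestrator

    # Hypothetical sketch -- an MDSSpec does not exist yet.
    class MDSSpec(orchestrator.StatelessServiceSpec):
        def __init__(self, name, placement=None, count=None):
            super(MDSSpec, self).__init__(name, placement=placement,
                                          count=count)
            # MDS-specific settings would go here, once it is clear
            # what an MDS deployment actually needs.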

> 
> - Given that, can we rename the 'svc_arg' arg to 'name'?

I'm in the process of renaming the `svc_arg`s to proper names, like
`zone`. Feel free to rename this one.
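
E.g. for the MDS command the rename could look like this (sketch only;
`fs_name` is just a suggestion, following the `zone` example):

    @_write_cli('orchestrator mds add',
                "name=fs_name,type=CephString",
                'Create an MDS service')
    def _mds_add(self, fs_name):
        # same behavior as before, just a descriptive parameter name
        spec = orchestrator.StatelessServiceSpec(fs_name)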

> 
> - The 'name' here, IIUC, is the name of the grouping of daemons.  I think 
> it was intended to be a file system, as per the docs:
> 
>  The ``name`` parameter is an identifier of the group of instances:
> 
>  * a CephFS file system for a group of MDS daemons,
>  * a zone name for a group of RGWs
> 
> but IIRC the new CephFS behavior is that all standby daemons go into the 
> same pool and are doled out to file systems that need them arbitrarily.  

We use this to set the name of the Rook CR, and this is afaik still
supposed to be the fs name.

> In that case, I think the only thing we would want to specify (in the rook 
> case where we don't pick daemon location) is the count of MDSs... and 
> then have a single name grouping.  Is that right for CephFS?  I have a 
> feeling it won't work for the other daemon types, though, like NFS 
> servers, which *do* care what they are serving up.

https://rook.io/docs/rook/master/ceph-filesystem-crd.html -> "File
System Settings"

> 
> - For SSH, none of that works, since we need to pass a location when 
> adding daemons.  It seems like we want something closer to nfs_add, 
> which is
> 
>     @_write_cli('orchestrator nfs add',
>                 "name=svc_arg,type=CephString "
>                 "name=pool,type=CephString "
>                 "name=namespace,type=CephString,req=false",
>                 'Create an NFS service')
> 
> i.e.,
> 
>    * 'add' takes a 'name' (the actual daemon name) and a location (if the 
> orch needs it).

You could add a field `hosts` to PlacementSpec of type `List[str]`. We
need this for all stateless services for the ssh orch anyway. Let's add
it now.
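
Something like this (sketch; I'm guessing at the current shape of
PlacementSpec):

    from typing import List, Optional

    # Sketch of the proposed extension: an explicit host list next to
    # whatever PlacementSpec already carries.
    class PlacementSpec(object):
        def __init__(self, label=None, hosts=None):
            # type: (Optional[str], Optional[List[str]]) -> None
            self.label = label
            # Explicit placement for orchestrators (like ssh) that
            # have to be told where daemons go.
            self.hosts = hosts or []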

>    * 'rm' takes the same name and removes it.

Yes

>    * 'update' does the smarts of adding ($want - $have) daemons for a 
> given group and generating names for them.  Something else organizes these 
> into groups (a common name prefix?).  I.e., 'update' basically builds on 
> 'add' and 'rm'.

add creates a completely new "service" (aka a group of physical
daemons). update is supposed to add and remove daemons within this
service. `update` is not supposed to call add or rm.

To summarize:

`add` creates a new "group of daemons" (aka "service"),
`rm` removes the whole group of daemons, and
`update` grows or shrinks an existing group.
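
For illustration, `update` on the ssh orchestrator could converge the
group size itself, along these lines (sketch; the helpers are
hypothetical, and spec.count is taken as the desired total):

    # Sketch only -- _get_daemons/_start_daemons/_stop_daemons are
    # made-up helpers.
    def update_mds(self, spec):
        running = self._get_daemons(spec.name)
        delta = (spec.count or 1) - len(running)
        if delta > 0:
            self._start_daemons(spec, delta)     # grow the group
        elif delta < 0:
            self._stop_daemons(running[delta:])  # shrink the group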


> And/or, we introduce some basic scheduling into ssh orchestrator (or 
> orchestrator_cli).  I'm not sure this is actually that smart since we can 
> probably get away with something quite simple: round-robin assignment of 
> daemons to hosts,

I'd be +1 for keeping the deployment 100% predictable at this time.
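
Round-robin over a sorted host list would already be deterministic,
e.g. (toy sketch):

    # Toy sketch: N daemons over a fixed host list; sorting makes the
    # result reproducible for a given inventory.
    def place_round_robin(hosts, count):
        hosts = sorted(hosts)
        return [hosts[i % len(hosts)] for i in range(count)]

    # place_round_robin(['node1', 'node2', 'node3'], 4)
    # -> ['node1', 'node2', 'node3', 'node1']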

> and the ability to label nodes for a daemon type or 
> daemon type + grouping.  This would basically give ssh orch what ansible 
> does as far as mapping out the deployment, and gracefully degrade to 
> something that "just works" (well enough) when you don't know/care 
> where things land.  Obviously having a real scheduler like that in k8s 
> do this is better, but for non-kube deployments, there is still a need for 
> placing daemons to hosts to make things easy for the human operator.

I haven't yet looked into the pros, cons and use cases of the different
low-level scheduling mechanisms.

> 
> sage
> 

-- 
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah, HRB 21284 (AG Nürnberg)
