Re: how should we manage ganesha's export tables from ceph-mgr ?

On 03/01/19 19:05, Jeff Layton wrote:
> Ricardo,
> 
> We chatted earlier about a new ceph mgr module that would spin up EXPORT
> blocks for ganesha and stuff them into a RADOS object. Here's basically
> what we're aiming for. I think it's pretty similar to what SuSE's
> solution is doing so I think it'd be good to collaborate here.

Just to make things clearer: we (SUSE) didn't implement a specific
downstream solution. The implementation we developed targets the
upstream ceph-dashboard code.

The dashboard backend code to manage ganesha exports is almost done. We
haven't opened a PR yet because we are still finishing the frontend
code, which might cause the backend to change a bit.

The current code is located here:
https://github.com/rjfd/ceph/tree/wip-dashboard-nfs

> 
> Probably I should write this up in a doc somewhere, but here's what I'd
> envision. First an overview:
> 
> The Rook.io ceph+ganesha CRD basically spins up nfs-ganesha pods under
> k8s that don't export anything by default and have a fairly stock
> config. Each ganesha daemon that is started has a boilerplate config
> file that ends with a %url include like this:
> 
>     %url rados://<pool>/<namespace>/conf-<nodeid>
> 
> The nodeid in this case is the unique nodeid within a cluster of ganesha
> servers using the rados_cluster recovery backend in ganesha. Rook
> enumerates these starting with 'a' and going through 'z' (and then 'aa',
> 'ab', etc.). So node 'a' would have a config object called "conf-a".
> 

We made the same assumption, and the current implementation can manage
the exports of different servers (configuration objects).

> What we currently lack is the code to set up those conf-<nodeid>
> objects. I know you have some code to do this sort of configuration via
> the dashboard and a REST API. Would it make more sense to split this bit
> out into a separate module, which would also allow it to be usable from
> the command line?

Yes, and no :) I think the benefit of splitting the code into a separate
module is that other mgr modules could manage ganesha exports through
the mgr "remote" call infrastructure (see the sketch below), and that
someone could manage ganesha exports without enabling the dashboard
module.
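
As an illustration, a minimal sketch of such a remote call from another
mgr module, assuming a hypothetical module named "nfs" that exposes a
create_export() method (both names are made up for this sketch):

    from mgr_module import MgrModule

    class MyModule(MgrModule):
        def create_foo_export(self):
            # remote() dispatches the call to the named module's method;
            # "nfs" and "create_export" are placeholders for this sketch.
            return self.remote('nfs', 'create_export',
                               host='node1.domain',
                               path='/foo',
                               pseudo='/foo',
                               access_type='RW')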

Regarding CLI commands, since the dashboard code exposes the export
management through a REST API, we can always use curl to call it
(although the commands will be more verbose).
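
For example, creating an export directly with curl would look roughly
like this, assuming the dashboard's bearer-token auth and with $TOKEN
and $EXPORT_JSON holding the auth token and the JSON payload shown in
the script example below (exact header and port depend on the setup):

    $ curl -X POST \
           -H "Content-Type: application/json" \
           -H "Authorization: Bearer $TOKEN" \
           -d "$EXPORT_JSON" \
           https://<mgr-host>:<dashboard-port>/api/nfs-ganesha/export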

In the dashboard source directory we have a small bash script to help
call the REST API from the CLI. Here's an example of creating an export
with the current implementation:

$ ./run-backend-api-request.sh POST /api/nfs-ganesha/export \
  '{
     "hostname": "node1.domain",
     "path": "/foo",
     "fsal": {"name": "CEPH", "user_id": "admin", "fs_name": "myfs"},
     "pseudo": "/foo",
     "tag": null,
     "access_type": "RW",
     "squash": "no_root_squash",
     "protocols": [4],
     "transports": ["TCP"],
     "clients": [{
       "addresses": ["10.0.0.0/8"],
       "access_type": "RO",
       "squash": "root"
     }]}'

The JSON fields and structure are similar to the ganesha export
configuration structure.
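
For reference, the payload above maps roughly onto a ganesha export
block along these lines (just a sketch; exact option names can vary
between ganesha versions):

    EXPORT {
        Export_Id = 1;           # assigned by the dashboard
        Path = "/foo";
        Pseudo = "/foo";
        Access_Type = RW;
        Squash = No_Root_Squash;
        Protocols = 4;
        Transports = TCP;
        FSAL {
            Name = CEPH;
            User_Id = "admin";
            Filesystem = "myfs";
        }
        CLIENT {
            Clients = 10.0.0.0/8;
            Access_Type = RO;
            Squash = Root;
        }
    }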

We also have other commands:

# list all exports
$ ./run-backend-api-request.sh GET /api/nfs-ganesha/export

# get an export
$ ./run-backend-api-request.sh GET \
	/api/nfs-ganesha/export/<hostname>/<id>

# update an export
$ ./run-backend-api-request.sh PUT \
	/api/nfs-ganesha/export/<hostname>/<id> <json string>

# delete an export
$ ./run-backend-api-request.sh DELETE \
	/api/nfs-ganesha/export/<hostname>/<id>


In the dashboard implementation, the server configuration is identified
by the <hostname> field, which does not need to be a real hostname.
The dashboard keeps a map between the hostname and the rados object URL
that stores the configuration of the server.

This host/rados_url map can be bootstrapped in two ways:
a) automatically: when an orchestrator backend is available, the
dashboard asks the orchestrator for this information.
b) manually: the dashboard provides CLI commands to add this
information. Example:
$ ceph dashboard ganesha-host-add <hostname> <rados_url>
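
For instance, mapping the "node1.domain" server used above to the
config object of node 'a' (values made up for illustration):
$ ceph dashboard ganesha-host-add node1.domain \
	rados://mypool/mynamespace/conf-a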


> 
> My thinking was that we'd probably want to create a new mgr module for
> that, and could wire it up to the command line with something like:
> 
>     $ ceph nfsexport create --id=100			\
> 			--pool=mypool			\
> 			--namespace=mynamespace		\
> 			--type=cephfs			\
> 			--volume=myfs			\
> 			--subvolume=/foo		\
> 			--pseudo=/foo			\
> 			--cephx_userid=admin		\
> 			--cephx_key=<base64 key>	\
> 			--client=10.0.0.0/8,ro,root	\
> 			--client=admhost,rw,none
> 
> ...the "client" is a string that would be a tuple of client access
> string, r/o or r/w, and the userid squashing mode, and could be
> specified multiple times.

The above command is similar to what we provide in the REST API, with
the difference that the dashboard generates the export ID.

Do you think it is important for the user to explicitly specify the
export ID?

> 
> We'd also want to add a way to remove and enumerate exports. Maybe:
> 
>     $ ceph nfsexport ls
>     $ ceph nfsexport rm --id=100
> 
> So the create command above would create an object called "export-100"
> in the given rados_pool/rados_namespace. 
> 
> From there, we'd need to also be able to "link" and "unlink" these
> export objects into the config files for each daemon. So if I have a
> cluster of 2 servers with nodeids "a" and "b":
> 
>     $ ceph nfsexport link --pool=mypool			\
> 			--namespace=mynamespace		\
> 			--id=100 			\
> 			--node=a			\
> 			--node=b
> 
> ...with a corresponding "unlink" command. That would append objects
> called "conf-a" and "conf-b" with this line:
> 
>     %url rados://mypool/mynamespace/export-100
> 
> ...and then call into the orchestrator to send a SIGHUP to the daemons
> to make them pick up the new configs. We might also want to sanity check
> whether any conf-* files are still linked to the export-* files before
> removing those objects.
> 
> Thoughts?

I got a bit lost with this link/unlink part. In the current dashboard
implementation, when we create an export, the implementation adds the
export configuration to the rados://<pool>/<namespace>/conf-<nodeid>
object and calls the orchestrator to update/restart the service.

It looks to me like you are separating the export creation from the
export deployment: first you create the export, and then you add it to
the service configuration.

We can also implement this two-step behavior in the dashboard, and in
the dashboard Web UI we can have a checkbox where the user specifies
whether to apply the new export right away or not.

In the dashboard, we will also implement a "copy" command to copy an
export configuration to another ganesha server. That will help with
creating similar exports on different servers.
Another option would be to have a list of <hostname>s in the
create-export function instead of a single "<hostname>" field (sketched
below).
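
With that option, the create-export payload would carry something like
the following instead of a single "hostname" (the field name here is
just a sketch, not something we have implemented):

    "hostnames": ["node1.domain", "node2.domain"]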

> 

-- 
Ricardo Dias
Senior Software Engineer - Storage Team
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
HRB 21284
(AG Nürnberg)


