On Wed, 09 Aug 2023, Chuck Lever wrote:
> On Wed, Aug 09, 2023 at 07:45:06AM +1000, NeilBrown wrote:
> > On Wed, 09 Aug 2023, Chuck Lever wrote:
> > > On Tue, Aug 08, 2023 at 10:20:44AM -0400, Jeff Layton wrote:
> > > >
> > > > It would probably be fairly simple to output well-formed yaml instead.
> > > > JSON and XML are a bit more of a pain.
> > >
> > > If folks don't mind, I would like more structured output like one of
> > > these self-documenting formats. (I know I said I didn't care before,
> > > but I'm beginning to care now ;-)
> >
> > Lustre, which I am somewhat involved with, uses YAML for various things.
> > If someone else introduced yaml-producing sysfs files to the kernel
> > first, that might make the path for lustre smoother :-)
>
> It worries me that there isn't yet kernel infrastructure for
> formatting yaml in sysfs files. That broadens the scope of this
> work significantly.
>
> > Another option is netlink which lustre is starting to use for
> > configuration and stats. It is a self-describing format. The code
> > looks verbose, but it is widely used in the kernel and so well supported.
>
> I just spent the last 6 months building a netlink upcall to handle
> TLS handshake requests for in-kernel TLS consumers. It is built on
> the recently-added yaml netlink specs and code generator. The yaml
> netlink specs are kept under:
>
> Documentation/netlink/specs/
>
> Using netlink would give us a lot of infrastructure for this
> facility, but I'm not sure it's worth the extra complexity. And it
> would /require/ the use of user space tooling (ie, not 'cat') to get
> to the information exported from the kernel. <shrug>

I do like the "cat" approach. Unfortunately it doesn't scale, and you
never really know when it needs to scale.

The nfsd/rpc cache.c auth cache is still a sore point for me. It works
nicely except that it breaks for gss because the keys get too big. So
we've had a couple of attempts to "fix" that. The fixes work, but they
are *different*.

The other well-known pain point is /proc/mounts. It is really cool that
you can "cat" that, but with thousands of mounts it can take tools like
systemd a long time to find changes.

Does any of that matter for collecting stats? Will we hit a wall?
I really don't know. I'd like to think that we won't....

> > > I'm also wondering if we really ought not add another file under
> > > /proc, which is essentially obsolete. Would /sys/fs/nfsd/yada be
> > > better for this facility?
> >
> > It is only under /proc because that is where it is mounted by default :-)
> > I think it might be sensible to create a node under /sys where all the
> > content of the nfsd filesystem also appears.
>
> There are things in the nfsd filesystem that really belong under
> /proc/net/rpc or elsewhere, so IMO such migration needs to be
> handled on a case-by-case basis -- different project for another
> time.

Absolutely.

> > I'm not keen on /sys/fs/nfsd because nfsd isn't a filesystem, it is a
> > service.
>
> How about /sys/module/nfsd ?

Not worse than /sys/fs/nfsd. Not really better though. Everything in
/sys/module/foo is about the module as a chunk of code - except
"parameters". I guess we could add "state".

Maybe configfs is the thing ... but I never liked configfs. It seems
like a solution in search of a problem.

I complained that /sys/fs is like procfs-v2. It is more that everything
other than devices (and block, bus, class) is procfs-v2. There are
little bits of regularity, but no big picture.
I would probably argue for /sys/services/sunrpc/{nfsd,lockd,nfsv4.1-cb}

Alternately, we could go for /sys/devices/virtual/sunrpc. The virtual
directory contains "workqueue", which is a service in some sense, and
contains subdirectories for specific work-queues.

I'm not sure that all of the nfsd stuff would fit under there... but
maybe I'm over-thinking things.

Thanks,
NeilBrown


> > > I hesitate to even mention network namespaces...
> >
> > Please do mention them - I find them too easy to forget about.
> > /proc/fs/nfsd/ inherits the network namespace from whoever mounts it.
> > So this can work perfectly.
> > If we created a mirror in /sys/ we would presumably use the namespace of
> > the process that opens the file.
>
> I agree: the network namespace of the process that opens the
> rpc_status file is just what we want to limit access to in-flight
> requests. The current network namespace of each thread is available
> via SVC_NET(rqst), so it should be quite simple to display only
> in-flight requests that match the opener's namespace.
>
>
> --
> Chuck Lever
>
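
[Editor's note: the per-namespace filtering Chuck describes is only a few
lines in whatever routine formats each in-flight request. A minimal sketch
follows, assuming a seq_file-based rpc_status file whose open() stashed the
opener's network namespace in seq_file->private; the function name and the
field printed are illustrative, not the actual nfsd patch.]

/*
 * Sketch only: show one in-flight request, but only if it belongs to
 * the network namespace of the process that opened the file.
 */
#include <linux/seq_file.h>
#include <linux/sunrpc/svc.h>
#include <net/net_namespace.h>

static int rpc_status_show_one(struct seq_file *m, struct svc_rqst *rqstp)
{
        struct net *opener_net = m->private;    /* captured at open time */

        /* Skip requests that belong to other network namespaces. */
        if (!net_eq(SVC_NET(rqstp), opener_net))
                return 0;

        /* Emit one record; the real file would print many more fields. */
        seq_printf(m, "xid: 0x%08x\n", be32_to_cpu(rqstp->rq_xid));
        return 0;
}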