> On Jan 24, 2024, at 7:06 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>
> On Wed, 2024-01-24 at 18:47 -0500, Chuck Lever wrote:
>> On Wed, Jan 24, 2024 at 06:41:27PM -0500, Jeff Layton wrote:
>>> On Wed, 2024-01-24 at 18:18 -0500, Josef Bacik wrote:
>>>> On Wed, Jan 24, 2024 at 05:57:06PM -0500, Jeff Layton wrote:
>>>>> On Wed, 2024-01-24 at 17:12 -0500, Josef Bacik wrote:
>>>>>> On Wed, Jan 24, 2024 at 03:32:06PM -0500, Chuck Lever wrote:
>>>>>>> On Wed, Jan 24, 2024 at 02:37:00PM -0500, Josef Bacik wrote:
>>>>>>>> We are running nfsd servers inside of containers with their own network
>>>>>>>> namespace, and we want to monitor these services using the stats found
>>>>>>>> in /proc. However these are not exposed in the proc inside of the
>>>>>>>> container, so we have to bind mount the host /proc into our containers
>>>>>>>> to get at this information.
>>>>>>>>
>>>>>>>> Separate out the stat counters init and the proc registration, and move
>>>>>>>> the proc registration into the pernet operations entry and exit points
>>>>>>>> so that these stats can be exposed inside of network namespaces.
>>>>>>>
>>>>>>> Maybe I missed something, but this looks like it exposes the global
>>>>>>> stat counters to all net namespaces...? Is that an information leak?
>>>>>>> As an administrator I might be surprised by that behavior.
>>>>>>>
>>>>>>> Seems like this patch needs to make nfsdstats and nfsd_svcstats into
>>>>>>> per-namespace objects as well.
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> I've got the patches written for this, but I've got a question. There's a
>>>>>>
>>>>>>     svc_seq_show(seq, &nfsd_svcstats);
>>>>>>
>>>>>> in nfsd/stats.c. This appears to be an empty struct, there's nothing that
>>>>>> utilizes it, so this is always going to print 0 right? There's a svc_info in
>>>>>> the nfsd_net, and that stats block appears to get updated properly. Should I
>>>>>> print this out here? I don't see anywhere we get the rpc stats out of nfsd, am
>>>>>> I missing something?
>>>>>> I don't want to rip out stuff that I don't quite
>>>>>> understand. Thanks,
>>>>>>
>>>>>>
>>>>>
>>>>> nfsd_svcstats ends up being the sv_stats for the nfsd service. The RPC
>>>>> code has some counters in there for counting different sorts of net and
>>>>> rpc events (see svc_process_common, and some of the recv and accept
>>>>> handlers). I think nfsstat(8) may fetch that info via the above
>>>>> seqfile, so it's definitely not unused (and it should be printing more
>>>>> than just a '0').
>>>>
>>>> Ahhh, I missed this bit
>>>>
>>>> struct svc_program nfsd_program = {
>>>> #if defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL)
>>>>     .pg_next = &nfsd_acl_program,
>>>> #endif
>>>>     .pg_prog = NFS_PROGRAM, /* program number */
>>>>     .pg_nvers = NFSD_NRVERS, /* nr of entries in nfsd_version */
>>>>     .pg_vers = nfsd_version, /* version table */
>>>>     .pg_name = "nfsd", /* program name */
>>>>     .pg_class = "nfsd", /* authentication class */
>>>>     .pg_stats = &nfsd_svcstats, /* version table */
>>>>     .pg_authenticate = &svc_set_client, /* export authentication */
>>>>     .pg_init_request = nfsd_init_request,
>>>>     .pg_rpcbind_set = nfsd_rpcbind_set,
>>>> };
>>>>
>>>> and so nfsd_svcstats definitely is getting used.
>>>>
>>>>>
>>>>> svc_info is a completely different thing: it's a container for the
>>>>> svc_serv...so I'm not sure I understand your question?
>>>>
>>>> I was just confused, and still am a little bit.
>>>>
>>>> The counters are easy, I put those into the nfsd_net struct and make everything
>>>> mess with those counters and report those from proc.
>>>>
>>>> However the nfsd_svcstats are in this svc_program thing, which appears to need
>>>> to be global? Or do I need to make it per net as well? Or do I need to do
>>>> something completely different to track the rpc stats per network namespace?
>>>
>>> Making the svc_program per-net is unnecessary for this (and probably not
>>> desirable).
>>> That structure sort of describes the nfsd rpc "program" and
>>> that is pretty much the same between containers.
>>
>> Maybe we want per-namespace svc_programs. Some RPC programs will
>> be registered in some namespaces, some in others? That might be
>> the simplest approach.
>>
>
> That seems like a much heavier lift, and I'm not sure I see the benefit.
> Here's nfsd_program today:
>
> struct svc_program nfsd_program = {
> #if defined(CONFIG_NFSD_V2_ACL) || defined(CONFIG_NFSD_V3_ACL)
>     .pg_next = &nfsd_acl_program,
> #endif
>     .pg_prog = NFS_PROGRAM, /* program number */
>     .pg_nvers = NFSD_NRVERS, /* nr of entries in nfsd_version */
>     .pg_vers = nfsd_version, /* version table */
>     .pg_name = "nfsd", /* program name */
>     .pg_class = "nfsd", /* authentication class */
>     .pg_stats = &nfsd_svcstats, /* version table */
>     .pg_authenticate = &svc_set_client, /* export authentication */
>     .pg_init_request = nfsd_init_request,
>     .pg_rpcbind_set = nfsd_rpcbind_set,
> };
>
> All of that seems fairly constant across containers. The main exception
> is the svc_stats, which does need to be per-program and per-net, at
> least for nfsd.

This would be the benefit right here: the stats need to be matrixed
per program and per net. The stats and the program (set of RPC
procedures) are pretty tightly interconnected. Some namespaces
might want NFSv2 or NFSv3, some might want NFSv4 only.

But I don't have a strong opinion about it at this point. You could
be right that this would be the more obtuse approach.

There is one fixed definition for each RPC program, so having one
svc_program per RPC program, and having each live in global module
memory, is sensible. The way stats work now is from a long-ago era.

> FWIW, looking at the other services that set pg_stats, none of them have
> a way to actually report them! They are write-only. We should probably
> make the others just set pg_stats to NULL so we don't bother
> incrementing on them.
>
> That should simplify reworking how this works for nfsd too...

True, but I've been told that having NLM RPC stats available would
be helpful for distro support teams.

It would be cool to delete some lines of code, but we should ask
around before tossing out this unused infrastructure. In fact,
Josef: what do you think? Would having NLM stats for your NFSv3
servers be helpful?

--
Chuck Lever
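
For readers following the thread, here is a rough userspace sketch of the
split being discussed: one fixed, global svc_program per RPC program, with
the counters kept in per-namespace state instead. It is only an
illustration under stated assumptions, not the kernel code or any posted
patch. The structs are trimmed stand-ins for the kernel's svc_stat,
svc_program and nfsd_net, and count_rpc_call() and show_stats() are
hypothetical helpers invented for this example.

```c
#include <stddef.h>
#include <stdio.h>

/* Trimmed stand-in for the kernel's struct svc_stat (most fields omitted). */
struct svc_stat {
	unsigned int netcnt;	/* network packets seen */
	unsigned int rpccnt;	/* RPC calls processed */
};

/* One fixed, global description per RPC program; it owns no counters here. */
struct svc_program {
	unsigned int pg_prog;	/* program number */
	const char *pg_name;	/* program name */
};

static const struct svc_program nfsd_program = {
	.pg_prog = 100003,	/* NFS program number */
	.pg_name = "nfsd",
};

/* Hypothetical per-namespace state, loosely modelled on struct nfsd_net. */
struct nfsd_net {
	const char *name;		/* label for the namespace (illustrative) */
	struct svc_stat nfsd_svcstats;	/* counters owned by this namespace */
};

/* Hypothetical helper: account one RPC call against a single namespace. */
static void count_rpc_call(struct nfsd_net *nn)
{
	nn->nfsd_svcstats.netcnt++;
	nn->nfsd_svcstats.rpccnt++;
}

/*
 * Hypothetical reporter: format the counters of one namespace only, so a
 * reader inside that namespace never sees another namespace's numbers.
 * Returns the snprintf() result.
 */
static int show_stats(char *buf, size_t len,
		      const struct svc_program *prog,
		      const struct nfsd_net *nn)
{
	return snprintf(buf, len, "%s@%s net %u rpc %u",
			prog->pg_name, nn->name,
			nn->nfsd_svcstats.netcnt, nn->nfsd_svcstats.rpccnt);
}
```

With two struct nfsd_net instances, calls to count_rpc_call() on one leave
the other's counters untouched, while both share the same nfsd_program.
That is the "per program and per net" matrix described above, without
duplicating the program definition itself.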