Re: [PATCH v2 10/13] nfsd: move th_cnt into nfsd_net

On Thu, 2024-01-25 at 16:56 -0500, Josef Bacik wrote:
> On Thu, Jan 25, 2024 at 04:01:27PM -0500, Chuck Lever wrote:
> > On Thu, Jan 25, 2024 at 02:53:20PM -0500, Josef Bacik wrote:
> > > This is the last global stat, move it into nfsd_net and adjust all the
> > > users to use that variant instead of the global one.
> > 
> > Hm. I thought nfsd threads were a global resource -- they service
> > all network namespaces. So, shouldn't the same thread count be
> > surfaced to all containers? Won't they all see all of the nfsd
> > processes?
> > 
> 

Each container is going to start however many threads its /proc/fs/nfsd/threads
is set to, regardless. I hadn't actually grokked that those threads just get
tossed onto the shared pile that services requests.
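
For reference, the shape of the change being discussed is just moving the
counter from the global nfsd stats into the per-namespace struct. Roughly
(a sketch only -- the nfsd_th_cnt name is illustrative, not necessarily
what the patch ends up using):

/* Before: one global counter shared by every namespace. */
struct nfsd_stats {
	atomic_t	th_cnt;		/* busy nfsd threads, system-wide */
};

/* After: the counter lives in the per-namespace state instead. */
struct nfsd_net {
	/* ... existing per-net fields ... */
	atomic_t	nfsd_th_cnt;	/* busy nfsd threads, this namespace */
};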

Is it possible for one container to start a small number of threads, but
have its client load be heavy enough that it spills over and ends up
stealing threads from other containers?

> I don't think we want the network namespaces seeing how many threads exist
> in the entire system, right?
> 
> Additionally, it appears that we can have multiple threads per network
> namespace, so it's not like this will just show 1 for each individual nn;
> it'll show however many threads have been configured for nfsd in that
> network namespace.
> 
> I'm good either way, but it makes sense to me to only surface the
> network-namespace-related thread count.  I could probably keep a global
> counter and only surface it if net == &init_net.  Let me know what you
> prefer.
> Thanks,
> 

The old stat meant "number of threads that are currently busy". The new
one will mean "number of threads that are busy servicing requests in
this namespace". The latter seems more useful on a per-ns basis.

In fact, it might be interesting to throw out some kind of warning when
the th_cnt exceeds the number of threads that you've allocated in the
container. That might be an indicator that you didn't spin up enough
threads and are poaching from other containers.
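
Something along these lines, perhaps (rough sketch; whether sv_nrthreads is
the right number to compare against per-net is an open question):

/*
 * Rough sketch: warn if more threads are busy on behalf of this
 * namespace than the namespace itself configured. Rate-limited so a
 * busy server doesn't spam the log.
 */
static void nfsd_warn_thread_overcommit(struct nfsd_net *nn)
{
	unsigned int busy = atomic_read(&nn->nfsd_th_cnt);

	if (nn->nfsd_serv && busy > nn->nfsd_serv->sv_nrthreads)
		pr_warn_ratelimited("nfsd: %u threads busy in a namespace that configured only %u\n",
				    busy, nn->nfsd_serv->sv_nrthreads);
}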

Or... maybe we should be queueing the RPC when that happens so that you
can't use more than you've allocated for the container?
-- 
Jeff Layton <jlayton@xxxxxxxxxx>




