On Tue, 25 Jan 2022, Chuck Lever III wrote:
> Hi Neil-
>
> The clean-up of NFSD thread management has been merged into v5.17-rc.
> I'm wondering if you have thought more about how to implement dynamic
> management.

I have some scraps of code, but nothing much.  My plan is to have
threads exit after they have been around for "a while", and to have
new threads created when there is "apparent need".

I think there always needs to be at least one thread (as thread
creation can fail, or block for a long time), but if there is more
than one and a thread finds that it has nothing immediately to do,
and it has been more than (e.g.) 5 minutes since it was created, then
it exits.  So there will always be gentle downward pressure on the
number of threads.

When data arrives on a transport and there is no thread available to
process it, we need to consider starting a new thread.  This is by
far the most interesting part.  I think we want a rate limit that is
a function of the number of threads already existing, e.g.:

 - if there are fewer than "nr_threads" threads, start a thread;
 - if there are between "nr_threads" and "8*nr_threads" threads,
   schedule a thread to start in 10ms, unless that has already been
   scheduled;
 - if there are more than "8*nr_threads" threads, don't start new
   threads.

Obviously there are some magic numbers in there that need to be
justified, and I don't have any justification as yet.  The "10ms"
should ideally be related to the typical request-processing time
(mean? max? mid-range?).  The "8*nr_threads" should probably be
separately configurable, with something like that value as a default.

So with the default of 4 (or 8?) threads, the count will quickly grow
to that number if there is any load, then slowly grow beyond it as
load persists.

It would be nice to find a way to measure whether the nfsd threads
are overloading the system in some way, but I haven't yet thought
about what that might mean.

>
> One thing that occurred to me was that there was a changeset that
> attempted to convert NFSD threads into work queues. We merged some
> of that but stopped when it was found that work queue scheduling
> seems to have some worst case behavior that sometimes impacts the
> performance of the Linux NFS server.
>
> The part of that work that was merged is now vestigial, and might
> need to be reverted before proceeding with dynamic NFSD thread
> management. For example:
>
> 476 void svc_xprt_enqueue(struct svc_xprt *xprt)
> 477 {
> 478         if (test_bit(XPT_BUSY, &xprt->xpt_flags))
> 479                 return;
> 480         xprt->xpt_server->sv_ops->svo_enqueue_xprt(xprt);
> 481 }
> 482 EXPORT_SYMBOL_GPL(svc_xprt_enqueue);
>
> The svo_enqueue_xprt method was to be different for the different
> scheduling mechanisms. But to this day there is only one method
> defined: svc_xprt_do_enqueue()
>
> Do you have plans to use this or discard it? If not, I'd like to
> see that it is removed eventually, as virtual functions are
> more costly than they used to be thanks to Spectre/Meltdown, and
> this one is invoked multiple times per RPC.

I think this can go - I see no use for it.

I suspect svo_module can go as well - I don't think the thread is
ever the thing that primarily keeps a module active.

svo_shutdown can go too.  Any code that calls svc_shutdown_net()
knows what the shutdown function should be, and so can call it
directly.

svo_function is useful, but as it is the only thing in sv_ops that is
needed, we can drop the sv_ops structure and just have a pointer to
the svo_function directly in the svc_serv.
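To make that concrete, here is a rough, untested sketch (not an
actual patch) of what the quoted svc_xprt_enqueue() could look like
with the svo_enqueue_xprt indirection removed, and of svc_serv
carrying the thread function directly.  The field name "sv_threadfn"
is only illustrative:

  /* Sketch: call the only implementation directly instead of going
   * through sv_ops->svo_enqueue_xprt.
   */
  void svc_xprt_enqueue(struct svc_xprt *xprt)
  {
          if (test_bit(XPT_BUSY, &xprt->xpt_flags))
                  return;
          svc_xprt_do_enqueue(xprt);
  }
  EXPORT_SYMBOL_GPL(svc_xprt_enqueue);

  /* Sketch: with svo_enqueue_xprt, svo_shutdown and svo_module gone,
   * struct svc_serv_ops can be dropped and the remaining thread
   * function stored directly (field name is illustrative).
   */
  struct svc_serv {
          /* ... existing fields ... */
          int             (*sv_threadfn)(void *data); /* was svo_function */
  };

That also keeps the per-RPC enqueue path free of an indirect call,
which addresses the Spectre/Meltdown cost you mention.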
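And going back to the thread-start rate limiting described earlier, a
very rough sketch of the shape of that decision is below.  None of
these names exist today - svc_consider_new_thread(),
sv_threads_wanted, sv_new_thread_work, SP_NEW_THREAD_PENDING and
svc_start_kthread() are all made up for illustration - and the 10ms
and 8x values are the unjustified magic numbers mentioned above:

  /* Sketch only: decide whether to start a new nfsd thread when data
   * arrives and no thread is available.  All identifiers other than
   * sv_nrthreads and sp_flags are hypothetical.
   */
  static void svc_consider_new_thread(struct svc_serv *serv,
                                      struct svc_pool *pool)
  {
          unsigned int nr = serv->sv_nrthreads;

          if (nr < serv->sv_threads_wanted) {
                  /* Below the configured thread count: start one now. */
                  svc_start_kthread(serv, pool);
          } else if (nr < 8 * serv->sv_threads_wanted) {
                  /* Between 1x and 8x the configured count: defer the
                   * start by ~10ms, unless one is already scheduled.
                   * The delayed work would clear the bit once it has
                   * started (or failed to start) a thread.
                   */
                  if (!test_and_set_bit(SP_NEW_THREAD_PENDING,
                                        &pool->sp_flags))
                          schedule_delayed_work(&serv->sv_new_thread_work,
                                                msecs_to_jiffies(10));
          }
          /* At or above 8x: don't start any more threads. */
  }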
I'm happy to provide patches for all this if you like.

NeilBrown