Re: [PATCH 13/14] nfsd: introduce concept of a maximum number of threads.

On 7/16/2024 9:31 AM, Chuck Lever III wrote:


On Jul 16, 2024, at 7:00 AM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:

On Tue, 2024-07-16 at 13:21 +1000, NeilBrown wrote:
On Tue, 16 Jul 2024, Jeff Layton wrote:
On Mon, 2024-07-15 at 17:14 +1000, NeilBrown wrote:
A future patch will allow the number of threads in each nfsd pool to
vary dynamically.
The lower bound will be the number explicitly requested via
/proc/fs/nfsd/threads or /proc/fs/nfsd/pool_threads.

The upper bound can be set in each net-namespace by writing
/proc/fs/nfsd/max_threads.  This upper bound applies across all pools;
there is no per-pool upper limit.

If no upper bound is set, then one is calculated.  A global upper limit
is chosen based on the amount of memory.  This limit only affects dynamic
changes.  Static configuration can always override it.

We track how many threads are configured in each net namespace, using
the max if one is set and the min otherwise.  We also track how many net
namespaces have nfsd configured with only a min, not a max.

The difference between the calculated max and the total allocation is
available to be shared among those namespaces which don't have a maximum
configured.  Within a namespace, the available share is distributed
equally across all pools.

In the common case there is one namespace and one pool.  A small number
of threads are configured as a minimum and no maximum is set.  In this
case the effective maximum will be directly based on total memory:
approximately 8 threads per gigabyte.
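
To make the arithmetic in the description above concrete, here is a rough sketch of how the shared headroom might be computed. It is not taken from the patch: the function names are hypothetical, the split between unbounded namespaces is assumed to be equal (the description only says the difference is "shared"), and rounding down with integer division is also an assumption. Only the "about 8 threads per gigabyte" figure and the equal split across pools come from the description.

```python
GIB = 2**30
THREADS_PER_GIB = 8  # "approximately 8 per gigabyte" from the description


def calculated_global_max(total_mem_bytes):
    """Global upper limit derived from memory (hypothetical helper)."""
    return (total_mem_bytes // GIB) * THREADS_PER_GIB


def per_pool_share(total_mem_bytes, allocated_threads,
                   unbounded_namespaces, pools_in_namespace):
    """Extra threads one pool may grow by in a namespace with no max set.

    allocated_threads is the total already configured across all
    namespaces (each namespace's max if set, otherwise its min).
    """
    if unbounded_namespaces == 0 or pools_in_namespace == 0:
        return 0
    # Headroom is whatever the calculated max leaves after static allocation.
    spare = max(0, calculated_global_max(total_mem_bytes) - allocated_threads)
    # Assumption: unbounded namespaces split the headroom equally.
    per_namespace = spare // unbounded_namespaces
    # From the description: pools within a namespace share equally.
    return per_namespace // pools_in_namespace


# Common case: one namespace, one pool, minimum of 8 threads, 16 GiB of RAM.
minimum = 8
share = per_pool_share(16 * GIB, minimum, 1, 1)
effective_max = minimum + share
```

In that common case the minimum plus the share equals the calculated global limit, which matches the statement that the effective maximum is based directly on total memory.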



Some of this may come across as bikeshedding, but I'd probably prefer
that this work a bit differently:

1/ I don't think we should enable this universally -- at least not
initially. What I'd prefer to see is a new pool_mode for the dynamic
threadpools (maybe call it "dynamic"). That gives us a clear opt-in
mechanism. Later once we're convinced it's safe, we can make "dynamic"
the default instead of "global".

2/ Rather than specifying a max_threads value separately, why not allow
the old threads/pool_threads interface to set the max and just have a
reasonable minimum setting (like the current default of 8)? Since we're
growing the threadpool dynamically, I don't see why we need to have a
real configurable minimum.

3/ The dynamic pool-mode should probably be layered on top of the
pernode pool mode. IOW, in a NUMA configuration, we should split the
threads across NUMA nodes.

Maybe we should start by discussing the goal.  What do we want
configuration to look like when we finish?

I think we want it to be transparent.  Sysadmin does nothing, and it all
works perfectly.  Or as close to that as we can get.


That's a nice eventual goal, but what do we do if we make this change
and it's not behaving for them? We need some way for them to revert to
traditional behavior if the new mode isn't working well.

As Steve pointed out (privately) there are likely to be cases
where the dynamic thread count adjustment creates too many
threads or somehow triggers a DoS. Admins want the ability to
disable new features that cause trouble, and it is impossible
for us to say truthfully that we have predicted every
misbehavior.

So +1 for having a mechanism for getting back the traditional
behavior, at least until we have confidence it is not going
to have troubling side-effects.

+1 on a configurable maximum as well, but I'll add a concern about
the NUMA node thing.

Not all CPU cores are created equal any more: there are "performance"
and "efficiency" (Atom) cores, and there can be a big difference. Also,
there are NUMA nodes with no CPUs at all, memory-only for example.
Then, CXL scrambles the topology again.

Let's not forget that these nfsd threads call into the filesystems,
which may desire very different NUMA affinities. For example, the nfsd
protocol side may prefer to be near the network adapter, while the
filesystem side may prefer to be near the storage. And RDMA can bypass
memory copy costs.

Thread count only addresses a fraction of these.

Yes, in a perfect world, fully autonomous thread count
adjustment would be amazing. Let's aim for that, but take
baby steps to get there.

Amazing indeed, and just as unlikely to be perfect. Caution is good.

Tom.



