Re: [PATCH] nfsd: don't fail OP_SETCLIENTID when there are lots of clients.

Dai Ngo <dai.ngo@xxxxxxxxxx> · Tue, 23 Apr 2024 11:02:27 -0700

On 4/23/24 6:15 AM, Chuck Lever III wrote:
On Apr 22, 2024, at 7:34 PM, NeilBrown <neilb@xxxxxxx> wrote:

On Mon, 22 Apr 2024, Chuck Lever wrote:
On Mon, Apr 22, 2024 at 12:09:19PM +1000, NeilBrown wrote:
The calculation of how many clients the nfs server can manage is only an
heuristic.  Triggering the laundromat to clean up old clients when we
have more than the heuristic limit is valid, but refusing to create new
clients is not.  Client creation should only fail if there really isn't
enough memory available.

This is not known to have caused a problem in production use, but
testing of lots of clients reports an error and it is not clear that
this error is justified.
It is justified, see 4271c2c08875 ("NFSD: limit the number of v4
clients to 1024 per 1GB of system memory"). In cases like these,
the recourse is to add more memory to the test system.
Does each client really need 1MB?
Obviously we don't want all memory to be used by client state....
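
For reference, the arithmetic behind the "1MB per client" question can be
sketched in user-space C. This is only an illustration of the heuristic as
described in the commit subject, not the nfsd code itself: the helper names
and the small-system floor are assumptions.

```c
#include <stdint.h>

/* Constant taken from the commit subject quoted above. */
#define CLIENTS_PER_GB 1024u

/* Hypothetical helper: derive a client limit from total RAM in bytes. */
static uint64_t max_clients_for_ram(uint64_t total_ram_bytes)
{
    uint64_t gb = total_ram_bytes >> 30;      /* whole GiB of RAM */
    uint64_t limit = gb * CLIENTS_PER_GB;

    /* Assumption: systems under 1 GiB still get the one-GiB allowance. */
    return limit < CLIENTS_PER_GB ? CLIENTS_PER_GB : limit;
}

/* Memory budget per client implied by the heuristic. */
static uint64_t bytes_per_client(void)
{
    return (UINT64_C(1) << 30) / CLIENTS_PER_GB;
}
```

So a 4GB test system would be capped at 4096 clients under this sketch, and
the implied budget is 1MB of memory per client.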

However, that commit claims that the client is told to retry; I
don't expect client creation to fail outright. Can you describe the
failure mode you see?
The failure mode is repeated client retries that never succeed.  I think
an outright failure would be preferable - it would make it clearer that
memory must be added.

The server has N active clients and M courtesy clients.
Triggering reclaim when N+M exceeds a limit and M>0 makes sense.
A hard failure (NFS4ERR_RESOURCE) when N exceeds a limit makes sense.
A soft failure (NFS4ERR_DELAY) while reclaim is running makes sense.

I don't think a retry while N exceeds the limit makes sense.
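
The policy proposed above could be sketched roughly as follows. This is a
user-space illustration only: the enum, struct and function names are
hypothetical, and using a single limit for both checks is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

enum admit_status {
    ADMIT_OK,        /* create the new client */
    ADMIT_DELAY,     /* soft failure: ask the client to retry */
    ADMIT_RESOURCE,  /* hard failure: too many active clients */
};

struct admit_decision {
    enum admit_status status;
    bool start_reclaim;  /* true if courtesy-client reclaim should be kicked */
};

/* n_active = N, n_courtesy = M, as in the description above. */
static struct admit_decision
admit_new_client(uint64_t n_active, uint64_t n_courtesy,
                 uint64_t limit, bool reclaim_running)
{
    struct admit_decision d = { ADMIT_OK, false };

    if (n_active > limit) {
        /* Reclaim cannot help: only active clients are over the limit. */
        d.status = ADMIT_RESOURCE;
        return d;
    }
    if (n_active + n_courtesy > limit && n_courtesy > 0) {
        /* Courtesy clients can be dropped: retry once reclaim has run. */
        d.start_reclaim = !reclaim_running;
        d.status = ADMIT_DELAY;
    }
    return d;
}
```

With a limit of 100, N=90 and M=20 yields a delay plus a reclaim kick, while
N=101 fails outright regardless of M.
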
It’s not optimal, I agree.

NFSD has to limit the total number of active and courtesy
clients, because otherwise it would be subject to an easy
(d)DoS attack, which Dai demonstrated to me before I
accepted his patch. A malicious actor or broken clients
can continue to create leases on the server until it stops
responding.

I think failing outright would accomplish the mitigation
as well as delaying does, but delaying once or twice
gives some slack that allows a mount attempt to succeed
eventually even when the server temporarily exceeds the
maximum client count.

Also IMO there could be a rate-limited pr_warn on the
server that fires to indicate when a low-memory situation
has been reached.
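
As a rough idea of what such a rate-limited warning does, here is a
user-space sketch. In the kernel this would more likely be a one-line
pr_warn_ratelimited() call; the struct, function name, message text and
interval here are all made up for illustration.

```c
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

struct warn_limiter {
    time_t interval;  /* minimum seconds between messages */
    time_t last;      /* time the last message was printed */
    bool   armed;     /* false until the first message is printed */
};

/* Returns true if the warning was printed, false if it was suppressed. */
static bool warn_client_limit(struct warn_limiter *rl, time_t now,
                              unsigned long clients, unsigned long limit)
{
    if (rl->armed && now - rl->last < rl->interval)
        return false;

    rl->armed = true;
    rl->last = now;
    fprintf(stderr, "nfsd: %lu clients exceeds limit of %lu\n",
            clients, limit);
    return true;
}
```

The caller passes in the current time, so a burst of failed client creations
inside one interval produces a single log line.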

The problem with NFS4ERR_RESOURCE, however, is that
NFSv4.1 and newer do not have that status code. All
versions of NFS have DELAY/JUKEBOX.

I recognize that you are tweaking only SETCLIENTID here,
but I think behavior should be consistent for all minor
versions of NFSv4.

Do we have a count of active vs courtesy clients?
Dai can correct me if I’m wrong, but I believe NFSD
maintains a count of both.

NFSD maintains both counts: nfs4_client_count for active clients and
nfsd_courtesy_clients for courtesy clients. However, the 'real'
active client count is (nfs4_client_count - nfsd_courtesy_clients).

But only the active leases really matter, because
courtesy clients can be dropped as memory becomes tight.

Yes, when the NFSD shrinker is activated it calls courtesy_client_reaper
to remove courtesy clients.
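
To illustrate the accounting, here is a user-space sketch. The struct and
helpers are hypothetical stand-ins, not the nfsd data structures, and it
assumes that reaping a courtesy client decrements both counters.

```c
#include <stdint.h>

struct client_counters {
    uint32_t client_count;      /* stand-in for nfs4_client_count */
    uint32_t courtesy_clients;  /* stand-in for nfsd_courtesy_clients */
};

/* 'Real' active clients: client_count - courtesy_clients. */
static uint32_t active_clients(const struct client_counters *c)
{
    if (c->courtesy_clients > c->client_count)
        return 0;  /* guard against underflow in this sketch */
    return c->client_count - c->courtesy_clients;
}

/* Drop up to 'want' courtesy clients; returns how many were dropped. */
static uint32_t reap_courtesy(struct client_counters *c, uint32_t want)
{
    uint32_t n = want < c->courtesy_clients ? want : c->courtesy_clients;

    /* Never drop more than the overall count in this sketch. */
    if (n > c->client_count)
        n = c->client_count;

    /* Assumption: a reaped client leaves both counters. */
    c->courtesy_clients -= n;
    c->client_count -= n;
    return n;
}
```

Under that assumption, reaping frees memory without changing the active
count: 100 clients with 30 courtesy gives 70 active before and after.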

-Dai

Dropping an active lease would be somewhat more
catastrophic.

--
Chuck Lever