Re: [PATCH] nfsd: don't fail OP_SETCLIENTID when there are lots of clients.

"NeilBrown" <neilb@xxxxxxx> · Thu, 25 Apr 2024 10:08:35 +1000

On Tue, 23 Apr 2024, Chuck Lever III wrote:
> 
> > On Apr 22, 2024, at 7:34 PM, NeilBrown <neilb@xxxxxxx> wrote:
> > 
> > On Mon, 22 Apr 2024, Chuck Lever wrote:
> >>> On Mon, Apr 22, 2024 at 12:09:19PM +1000, NeilBrown wrote:
> >>> The calculation of how many clients the nfs server can manage is only an
> >>> heuristic.  Triggering the laundromat to clean up old clients when we
> >>> have more than the heuristic limit is valid, but refusing to create new
> >>> clients is not.  Client creation should only fail if there really isn't
> >>> enough memory available.
> >>> 
> >>> This is not known to have caused a problem is production use, but
> >>> testing of lots of clients reports an error and it is not clear that
> >>> this error is justified.
> >> 
> >> It is justified, see 4271c2c08875 ("NFSD: limit the number of v4
> >> clients to 1024 per 1GB of system memory"). In cases like these,
> >> the recourse is to add more memory to the test system.
> > 
> > Does each client really need 1MB?
> > Obviously we don't want all memory to be used by client state....
> > 
> >> 
> >> However, that commit claims that the client is told to retry; I
> >> don't expect client creation to fail outright. Can you describe the
> >> failure mode you see?
> > 
> > The failure mode is repeated client retries that never succeed.  I think
> > an outright failure would be preferable - it would be more clear than
> > memory must be added.
> > 
> > The server has N active clients and M courtesy clients.
> > Triggering reclaim when N+M exceeds a limit and M>0 makes sense.
> > A hard failure (NFS4ERR_RESOURCE) when N exceeds a limit makes sense.
> > A soft failure (NFS4ERR_DELAY) while reclaim is running makes sense.
> > 
> > I don't think a retry while N exceeds the limit makes sense.
> 
> It’s not optimal, I agree.
> 
> NFSD has to limit the total number of active and courtesy
> clients, because otherwise it would be subject to an easy
> (d)DoS attack, which Dai demonstrated to me before I
> accepted his patch. A malicious actor or broken clients
> can continue to create leases on the server until it stops
> responding.
> 
> I think failing outright would accomplish the mitigation
> as well as delaying does, but delaying once or twice
> gives some slack that allows a mount attempt to succeed
> eventually even when the server temporarily exceeds the
> maximum client count.

I doubt that the set of active clients is so dynamic that it is worth
waiting in case some client goes away soon.  If we hit the limit then we
probably already have more clients than we can reasonably handle and it
is time to indicate failure.

> 
> Also IMO there could be a rate-limited pr_warn on the
> server that fires to indicate when a low-memory situation
> has been reached.

Yes, server side warnings would be a good idea.

> 
> The problem with NFS4ERR_RESOURCE, however, is that
> NFSv4.1 and newer do not have that status code. All
> versions of NFS have DELAY/JUKEBOX.

I didn't realise that.  Lots of code in nfs4xdr.c returns
nfserr_resource.  For v4.1 it appears to get translated to
nfserr_rep_too_big_too_cache or nfserr_rep_too_big, which might not
always make sense.

> 
> I recognize that you are tweaking only SETCLIENTID here,
> but I think behavior should be consistent for all minor
> versions of NFSv4.

I really want to change EXCHANGEID too.  Maybe we should use
NFS4ERR_SERVERFAULT.  It seems to be a catch-all for "some fatal error".
The server has failed to allocate required resources.

Thanks,
NeilBrown

> 
> 
> > Do we have a count of active vs courtesy clients?
> 
> Dai can correct me if I’m wrong, but I believe NFSD
> maintains a count of both.
> 
> But only the active leases really matter, becase
> courtesy clients can be dropped as memory becomes tight.
> Dropping an active lease would be somewhat more
> catastrophic.
> 
> 
> —
> Chuck Lever