On Tue, 2023-01-31 at 16:34 +0000, Chuck Lever III wrote:
>
> > On Jan 31, 2023, at 9:42 AM, Andrew J. Romero <romero@xxxxxxxx> wrote:
> >
> > In a large campus environment, usage of the relevant memory pool will eventually get so
> > high that a server-side reboot will be needed.
>
> The above is sticking with me a bit.
>
> Rebooting the server should force clients to re-establish state.
>
> Are they not re-establishing open file state for users whose
> ticket has expired? I would think each client would re-establish
> state for those open files anyway, and the server would be in the
> same overcommitted state it was in before it rebooted.
>
> We might not have an accurate root cause analysis yet, or I could
> be missing something.
>

My assumption was that the client wasn't able to get credentials to run
the CLOSE RPC in this case, so it can't properly send the call. That's a
big assumption though. It'd be good to confirm this.

It looks like the CLOSE codepath on the client calls nfs4_state_protect
with NFS_SP4_MACH_CRED_CLEANUP, and that should make it use the machine
cred? I'm not 100% clear here though...it looks like that may be
conditional on what was sent by the server in EXCHANGE_ID.

FWIW, I don't see any reason we shouldn't use the machine cred for the
close compound. Nothing we do in there should require permission
checking.

BTW: is this NFSv4.0 or v4.1+ (or a mix)?
-- 
Jeff Layton <jlayton@xxxxxxxxxx>