> On Jun 15, 2022, at 11:39 AM, Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
> 
>> On Jun 15, 2022, at 11:28 AM, Wang Yugui <wangyugui@xxxxxxxxxxxx> wrote:
>> 
>> A question about the coming rhashtable.
>> 
>> Now multiple nfsd export share a cache pool.
>> 
>> In the coming rhashtable, a nfsd export could use a private cache pool
>> to improve scale out?
> 
> That seems like a premature optimization. We don't know that the hashtable,
> under normal (ie, non-generic/531) workloads, is a scaling problem.
> 
> However, I am considering (in the future) creating separate filecaches
> for each nfsd_net. So I'm not rejecting your idea outright.

To expand on this a little: Just a single rhashtable will enable better
scaling, and so will fixing the LRU issues, and those are both in plan
for my current set of fixes. It's not clear to me that pathological
workloads like generic/531 on NFSv4 are common, so it's quite possible
that just these two changes will be enough for realistic workloads for
the time being.

My near-term goal for generic/531 is to prevent it from crashing NFSD.
Hopefully we can look at ways to enable that test to pass more often,
and fail gracefully when it doesn't pass. The issue is how the server
behaves when it can't open more files, which is somewhat separate from
the data structure efficiency issues you and Frank pointed out.

I'd like to get the above fixes ready for merge by the next window, so
I'm trying to narrow the changes in this set of fixes to make sure they
will be solid in a couple of months.

It will be a heavier lift to go from just one to two filecaches per
server. After that, it will likely be easier to go from two filecaches
to multiple filecaches, but I feel that's down the road.

In the medium term, supporting separate filecaches for NFSv3 and NFSv4
files is worth considering. NFSv3 nfsd_file items need to be managed
automatically and can be subject to a shrinker, since there's no client
impact when releasing a cached filecache item.
NFSv4 nfsd_file items manage themselves (via OPEN/CLOSE), so an LRU
isn't really needed there (and isn't terribly effective anyway). A
shrinker can't easily release NFSv4 nfsd_file items without the server
losing state, and clients have to recover in that case.

And, it turns out that the current filecache capacity-limiting
mechanism forces NFSv3 items out of the filecache in favor of NFSv4
items when the cache has more than NFSD_FILE_LRU_LIMIT items in it.
IMO that's obviously undesirable behavior for common mixed-version
workloads.

-- 
Chuck Lever