> On Jun 15, 2022, at 11:39 AM, Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
> 
>> On Jun 15, 2022, at 11:28 AM, Wang Yugui <wangyugui@xxxxxxxxxxxx> wrote:
>> 
>> A question about the coming rhashtable.
>> 
>> Now multiple nfsd export share a cache pool.
>> 
>> In the coming rhashtable, a nfsd export could use a private cache pool
>> to improve scale out?
> 
> That seems like a premature optimization. We don't know that the hashtable,
> under normal (ie, non-generic/531) workloads, is a scaling problem.
> 
> However, I am considering (in the future) creating separate filecaches
> for each nfsd_net. So I'm not rejecting your idea outright.

To expand on this a little: Just a single rhashtable will enable better
scaling, and so will fixing the LRU issues, and those are both in plan
for my current set of fixes. It's not clear to me that pathological
workloads like generic/531 on NFSv4 are common, so it's quite possible
that just these two changes will be enough for realistic workloads for
the time being.

My near-term goal for generic/531 is to prevent it from crashing NFSD.
Hopefully we can look at ways to enable that test to pass more often,
and fail gracefully when it doesn't pass. The issue is how the server
behaves when it can't open more files, which is somewhat separate from
the data structure efficiency issues you and Frank pointed out.

I'd like to get the above fixes ready for merge by the next window, so
I'm trying to narrow the changes in this set of fixes to make sure they
will be solid in a couple of months.

It will be a heavier lift to go from just one to two filecaches per
server. After that, it will likely be easier to go from two filecaches
to multiple filecaches, but I feel that's down the road.

In the medium term, supporting separate filecaches for NFSv3 and NFSv4
files is worth considering. NFSv3 nfsd_file items need to be managed
automatically and can be subject to a shrinker, since there's no client
impact when releasing a cached filecache item.
NFSv4 nfsd_file items manage themselves (via OPEN/CLOSE), so an LRU
isn't really needed there (and isn't terribly effective anyway). A
shrinker can't easily release NFSv4 nfsd_file items without the server
losing state, and clients have to recover in that case.

And, it turns out that the current filecache capacity-limiting
mechanism forces NFSv3 items out of the filecache in favor of NFSv4
items when the cache has more than NFSD_FILE_LRU_LIMIT items in it.
IMO that's obviously undesirable behavior for common mixed-version
workloads.

-- 
Chuck Lever