Howdy Neil - On 2/5/25 6:04 PM, NeilBrown wrote: > > Hi Chuck, > what are you current thoughts on merging this series? I think NFSD is better off trying to make list_lru work, as that is a broadly shared mechanism that has had a lot of review and testing. So far there hasn't been a lot of evidence that the proposed replacement is significantly more efficient; it's just different. I generally agree with Dave's sentiment: > Using common infrastructure, even when it's not an exact perfect > fit, holds a lot more value to the wider community than a one-off > special snowflake implementation that only one or two people > understand.... LRU for the NFSD filecache is a case where IMO code re-use /is/ valuable. One of my peeves about the filecache is that it is buried inside of NFSD so deeply that it is well-nigh impossible to build unit testing for it. I'd rather not add more special cases in this area of the code unless there is a palpable benefit. > One of my thoughts is that I now realise that > > Commit 0758a7212628 ("nfsd: drop the lock during filecache LRU scans") > > in nfsd-next is bad. If there are multiple nodes (and so multiple > sublits in the list_lru) and if there are more than a few dozen files in > the lru, then that patch results in the first sublist being completely > freed before anything is done to the next. > I think the best fix for backporting to -stable is to wrap a > for_each_node_state((nid) around the while loop and using > list_lru_count_node() and list_lru_walk_node(). > > I could send a SQUASH patch for that and rebase this series on it. Yeah, proper NUMA sensitivity is important, I feel. Please post a full replacement. nfsd-testing is a topic branch, so the current revision of 0758a7212628 can be replaced wholesale. > Thanks, > NeilBrown > > > On Mon, 27 Jan 2025, NeilBrown wrote: >> [ >> davec added to cc incase I've said something incorrect about list_lru >> >> Changes in this version: >> - no _bh locking >> - add name for a magic constant >> - remove unnecessary race-handling code >> - give a more meaningfule name for a lock for /proc/lock_stat >> - minor cleanups suggested by Jeff >> >> ] >> >> The nfsd filecache currently uses list_lru for tracking files recently >> used in NFSv3 requests which need to be "garbage collected" when they >> have becoming idle - unused for 2-4 seconds. >> >> I do not believe list_lru is a good tool for this. It does not allow >> the timeout which filecache requires so we have to add a timeout >> mechanism which holds the list_lru lock while the whole list is scanned >> looking for entries that haven't been recently accessed. When the list >> is largish (even a few hundred) this can block new requests noticably >> which need the lock to remove a file to access it. >> >> This patch removes the list_lru and instead uses 2 simple linked lists. >> When a file is accessed it is removed from whichever list it is on, >> then added to the tail of the first list. Every 2 seconds the second >> list is moved to the "freeme" list and the first list is moved to the >> second list. This avoids any need to walk a list to find old entries. >> >> These lists are per-netns rather than global as the freeme list is >> per-netns as the actual freeing is done in nfsd threads which are >> per-netns. >> >> Thanks, >> NeilBrown >> >> [PATCH 1/7] nfsd: filecache: remove race handling. >> [PATCH 2/7] nfsd: filecache: use nfsd_file_dispose_list() in >> [PATCH 3/7] nfsd: filecache: move globals nfsd_file_lru and >> [PATCH 4/7] nfsd: filecache: change garbage collection list >> [PATCH 5/7] nfsd: filecache: document the arbitrary limit on >> [PATCH 6/7] nfsd: filecache: change garbage collection to a timer. >> [PATCH 7/7] nfsd: filecache: give disposal lock a unique class name. >> >> >> > -- Chuck Lever