> On May 28, 2022, at 10:32 PM, Wang Yugui <wangyugui@xxxxxxxxxxxx> wrote:
> 
> Hi,
> 
>>> On May 27, 2022, at 4:37 PM, Frank van der Linden <fllinden@xxxxxxxxxx> wrote:
>>> 
>>> On Fri, May 27, 2022 at 06:59:47PM +0000, Chuck Lever III wrote:
>>>> 
>>>> Hi Frank-
>>>> 
>>>> Bruce recently reminded me about this issue. Is there a bugzilla somewhere?
>>>> Do you have a reproducer I can try?
>>> 
>>> Hi Chuck,
>>> 
>>> The easiest way to reproduce the issue is to run generic/531 over an
>>> NFSv4 mount, using a system with a large number of CPUs on the client
>>> side (or just scaling the test up manually - it has a calculation based
>>> on the number of CPUs).
>>> 
>>> The test will take a long time to finish. I initially described the
>>> details here:
>>> 
>>> https://lore.kernel.org/linux-nfs/20200608192122.GA19171@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>>> 
>>> Since then, it was also reported here:
>>> 
>>> https://lore.kernel.org/all/20210531125948.2D37.409509F4@xxxxxxxxxxxx/T/#m8c3e4173696e17a9d5903d2a619550f352314d20
>> 
>> Thanks for the summary. So, there isn't a bugzilla tracking this
>> issue? If not, please create one here:
>> 
>> https://bugzilla.linux-nfs.org/
>> 
>> Then we don't have to keep asking for a repeat summary ;-)
>> 
>> 
>>> I posted an experimental patch, but it's actually not quite correct
>>> (although I think the idea behind it makes sense):
>>> 
>>> https://lore.kernel.org/linux-nfs/20201020183718.14618-4-trondmy@xxxxxxxxxx/T/#m869aa427f125afee2af9a89d569c6b98e12e516f
>> 
>> A handful of random comments:
>> 
>> - nfsd_file_put() is now significantly different than it used
>>   to be, so that part of the patch will have to be updated in
>>   order to apply to v5.18+
> 
> When there are many open files (more than NFSD_FILE_LRU_THRESHOLD),
> nfsd_file_gc() wastes a lot of CPU time.

Thanks for the suggestion. I agree that CPU and memory bandwidth
are not being used effectively by the filecache garbage collector.

> Can we serialize nfsd_file_gc() for v5.18+ as a first step?

If I understand Frank's problem statement correctly, garbage
collection during an nfsd_file_put() under an NFSv4-only workload
walks the length of the LRU list and finds nothing to evict 100%
of the time. That seems like a bug, and fixing it might give us
the most improvement in this area.

> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> index 3d944ca..6abefd9 100644
> --- a/fs/nfsd/filecache.c
> +++ b/fs/nfsd/filecache.c
> @@ -63,6 +63,8 @@ static struct delayed_work nfsd_filecache_laundrette;
> 
>  static void nfsd_file_gc(void);
> 
> +static atomic_t nfsd_file_gc_queue_delayed = ATOMIC_INIT(0);
> +
>  static void
>  nfsd_file_schedule_laundrette(void)
>  {
> @@ -71,8 +73,10 @@ nfsd_file_schedule_laundrette(void)
>  	if (count == 0 || test_bit(NFSD_FILE_SHUTDOWN, &nfsd_file_lru_flags))
>  		return;
> 
> -	queue_delayed_work(system_wq, &nfsd_filecache_laundrette,
> +	if(atomic_cmpxchg(&nfsd_file_gc_queue_delayed, 0, 1)==0){
> +		queue_delayed_work(system_wq, &nfsd_filecache_laundrette,
>  			NFSD_LAUNDRETTE_DELAY);

I might misunderstand what this is doing exactly. I'm sure there's
a preferred idiom in the kernel for not queuing a new work item
when one is already running, so that an open-coded cmpxchg is not
necessary.

It might be better to allocate a specific workqueue for filecache
garbage collection, and limit the maximum number of active work
items allowed on that queue to 1. One benefit of that might be
reducing the number of threads contending for the filecache data
structures.
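Untested, and only a sketch of what I mean — the queue name, the
WQ_UNBOUND flag, and where the allocation lives are my guesses, not
a reviewed change:

/*
 * Rough sketch: give the laundrette its own workqueue and cap
 * max_active at 1 so at most one GC work item executes at a time.
 */
static struct workqueue_struct *nfsd_filecache_wq;

int nfsd_file_cache_init(void)
{
	/* WQ_UNBOUND with max_active == 1: work items are serialized */
	nfsd_filecache_wq = alloc_workqueue("nfsd_filecache",
					    WQ_UNBOUND, 1);
	if (!nfsd_filecache_wq)
		return -ENOMEM;
	/* ... existing initialization ... */
	return 0;
}

static void
nfsd_file_schedule_laundrette(void)
{
	long count = atomic_long_read(&nfsd_filecache_count);

	if (count == 0 || test_bit(NFSD_FILE_SHUTDOWN, &nfsd_file_lru_flags))
		return;

	/*
	 * queue_delayed_work() is already a no-op when the work item
	 * is still pending, so no extra flag is needed to avoid
	 * double-queuing.
	 */
	queue_delayed_work(nfsd_filecache_wq, &nfsd_filecache_laundrette,
			   NFSD_LAUNDRETTE_DELAY);
}

With max_active capped at 1, concurrent callers can still schedule
the laundrette, but the workqueue guarantees only one instance runs
at a time, which would make the open-coded serialization in
nfsd_file_gc() below unnecessary.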
If GC is capped like this, maybe create one GC workqueue per nfsd_net
so GC activity in one namespace does not starve filecache GC in other
namespaces. (A rough sketch of what I mean appears at the end of this
message.)

Before I would take patches like this, though, I'd like to see
performance data demonstrating the problem and the improvement,
presented separately or as part of the patch descriptions.

If you repost, start a separate thread and cc:

M:	Tejun Heo <tj@xxxxxxxxxx>
R:	Lai Jiangshan <jiangshanlai@xxxxxxxxx>

to get review by workqueue domain experts.

> +	}
>  }
> 
>  static void
> @@ -468,15 +472,22 @@ nfsd_file_lru_walk_list(struct shrink_control *sc)
>  	return ret;
>  }
> 
> +// nfsd_file_gc() support concurrency, but serialize it
> +atomic_t nfsd_file_gc_running = ATOMIC_INIT(0);
>  static void
>  nfsd_file_gc(void)
>  {
> -	nfsd_file_lru_walk_list(NULL);
> +	if(atomic_cmpxchg(&nfsd_file_gc_running, 0, 1)==0){
> +		nfsd_file_lru_walk_list(NULL);
> +		atomic_set(&nfsd_file_gc_running, 0);
> +	}
>  }
> 
>  static void
>  nfsd_file_gc_worker(struct work_struct *work)
>  {
> +	atomic_set(&nfsd_file_gc_queue_delayed, 0);
> +	// pr_info("nfsd_file_gc_worker\n");
>  	nfsd_file_gc();
>  	nfsd_file_schedule_laundrette();
>  }
> 
> Best Regards
> Wang Yugui (wangyugui@xxxxxxxxxxxx)
> 2022/05/29

--
Chuck Lever
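P.S. The per-nfsd_net arrangement mentioned above might look
something like this. Again untested; the field name and the
init/shutdown helpers are invented purely for illustration:

/* in fs/nfsd/netns.h */
struct nfsd_net {
	/* ... existing fields ... */
	struct workqueue_struct *filecache_wq;	/* hypothetical */
};

/* called from the per-net init path */
static int nfsd_filecache_net_init(struct nfsd_net *nn)
{
	nn->filecache_wq = alloc_workqueue("nfsd_filecache",
					   WQ_UNBOUND, 1);
	if (!nn->filecache_wq)
		return -ENOMEM;
	return 0;
}

/* called from the per-net shutdown path */
static void nfsd_filecache_net_shutdown(struct nfsd_net *nn)
{
	destroy_workqueue(nn->filecache_wq);
	nn->filecache_wq = NULL;
}

Each namespace's laundrette would then queue its delayed work on
nn->filecache_wq, so heavy GC traffic in one container would not
crowd out filecache GC in the others.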