Hi, > > > On May 28, 2022, at 10:32 PM, Wang Yugui <wangyugui@xxxxxxxxxxxx> wrote: > > > > Hi, > > > >>> On May 27, 2022, at 4:37 PM, Frank van der Linden <fllinden@xxxxxxxxxx> wrote: > >>> > >>> On Fri, May 27, 2022 at 06:59:47PM +0000, Chuck Lever III wrote: > >>>> > >>>> Hi Frank- > >>>> > >>>> Bruce recently reminded me about this issue. Is there a bugzilla somewhere? > >>>> Do you have a reproducer I can try? > >>> > >>> Hi Chuck, > >>> > >>> The easiest way to reproduce the issue is to run generic/531 over an > >>> NFSv4 mount, using a system with a larger number of CPUs on the client > >>> side (or just scaling the test up manually - it has a calculation based > >>> on the number of CPUs). > >>> > >>> The test will take a long time to finish. I initially described the > >>> details here: > >>> > >>> https://lore.kernel.org/linux-nfs/20200608192122.GA19171@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/ > >>> > >>> Since then, it was also reported here: > >>> > >>> https://lore.kernel.org/all/20210531125948.2D37.409509F4@xxxxxxxxxxxx/T/#m8c3e4173696e17a9d5903d2a619550f352314d20 > >> > >> Thanks for the summary. So, there isn't a bugzilla tracking this > >> issue? If not, please create one here: > >> > >> https://bugzilla.linux-nfs.org/ > >> > >> Then we don't have to keep asking for a repeat summary ;-) > >> > >> > >>> I posted an experimental patch, but it's actually not quite correct > >>> (although I think the idea behind it is makes sense): > >>> > >>> https://lore.kernel.org/linux-nfs/20201020183718.14618-4-trondmy@xxxxxxxxxx/T/#m869aa427f125afee2af9a89d569c6b98e12e516f > >> > >> A handful of random comments: > >> > >> - nfsd_file_put() is now significantly different than it used > >> to be, so that part of the patch will have to be updated in > >> order to apply to v5.18+ > > > > When many open files(>NFSD_FILE_LRU_THRESHOLD), > > nfsd_file_gc() will waste many CPU times. > > Thanks for the suggestion. I agree that CPU and > memory bandwidth are not being used effectively > by the filecache garbage collector. > > > > Can we serialize nfsd_file_gc() for v5.18+ as a first step? > > If I understand Frank's problem statement correctly, > garbage collection during an nfsd_file_put() under > an NFSv4-only workload walks the length of the LRU > list and finds nothing to evict 100% of the time. > That seems like a bug, and fixing it might give us > the most improvement in this area. > > > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > > index 3d944ca..6abefd9 100644 > > --- a/fs/nfsd/filecache.c > > +++ b/fs/nfsd/filecache.c > > @@ -63,6 +63,8 @@ static struct delayed_work nfsd_filecache_laundrette; > > > > static void nfsd_file_gc(void); > > > > +static atomic_t nfsd_file_gc_queue_delayed = ATOMIC_INIT(0); > > + > > static void > > nfsd_file_schedule_laundrette(void) > > { > > @@ -71,8 +73,10 @@ nfsd_file_schedule_laundrette(void) > > if (count == 0 || test_bit(NFSD_FILE_SHUTDOWN, &nfsd_file_lru_flags)) > > return; > > > > - queue_delayed_work(system_wq, &nfsd_filecache_laundrette, > > + if(atomic_cmpxchg(&nfsd_file_gc_queue_delayed, 0, 1)==0){ > > + queue_delayed_work(system_wq, &nfsd_filecache_laundrette, > > NFSD_LAUNDRETTE_DELAY); > > I might misunderstand what this is doing exactly. > I'm sure there's a preferred idiom in the kernel > for not queuing a new work item when one is already > running, so that an open-coded cmpxchg is not > necessary. Thanks. I dropped queue_delayed_work related change and reposted it. > It might be better to allocate a specific workqueue > for filecache garbage collection, and limit the > maximum number of active work items allowed on that > queue to 1. One benefit of that might be reducing > the number of threads contending for the filecache > data structures. In this way, the number of threads of filecache garbage collection is reduced. but filecache is still accessed by all nfsd threads for filecache fetch/save. Best Regards Wang Yugui (wangyugui@xxxxxxxxxxxx) 2022/05/30 > If GC is capped like this, maybe create one GC > workqueue per nfsd_net so GC activity in one > namespace does not starve filecache GC in other > namespaces. > > Before I would take patches like this, though, > performance data demonstrating a problem and some > improvement should be presented separately or as > part of the patch descriptions. > If you repost, start a separate thread and cc: > > M: Tejun Heo <tj@xxxxxxxxxx> > R: Lai Jiangshan <jiangshanlai@xxxxxxxxx> > > to get review by workqueue domain experts. > >