On Wed, 2021-09-29 at 16:21 -0400, Benjamin Coddington wrote: > On 29 Sep 2021, at 12:23, Trond Myklebust wrote: > > > On Wed, 2021-09-29 at 10:56 -0400, Benjamin Coddington wrote: > > > On 29 Sep 2021, at 9:49, trondmy@xxxxxxxxxx wrote: > > > > > > > From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> > > > > > > > > If a user is doing 'ls -l', we have a heuristic in GETATTR that > > > > tells > > > > the readdir code to try to use READDIRPLUS in order to refresh > > > > the > > > > inode > > > > attributes. In certain cirumstances, we also try to invalidate > > > > the > > > > remaining directory entries in order to ensure this refresh. > > > > > > > > If there are multiple readers of the directory, we probably > > > > should > > > > avoid > > > > invalidating the page cache, since the heuristic breaks down in > > > > that > > > > situation anyway. > > > > > > Hi Trond, I'm curious about the motivation for this work because > > > we're often > > > managing expectations about performance between various > > > workloads. > > > What > > > does the workload look like that prompted you to make this > > > optimization? > > > > > > I'm also interested because I have some work that improves > > > readdir > > > performance for multiple readers on large directories, but is > > > rotting > > > because things have been "good enough" for folks over here. > > > > > > > Are you talking about this patch specifically, or the series in > > general? > > A bit of both - the first two patches didn't really make sense to me, > but I > get it now. Thanks. > > > The general motivation for the series is that in certain > > circumstances > > involving a read-only scenario (no changes to the directory) and > > multiple processes we're seeing a regression w.r.t. older kernels. > > > > The motivation for this particular patch is twofold: > > 1. I want to get rid of the cached page_index in the inode and > > replace it with something less persistent (inodes are > > forever, > > unlike file descriptors). > > 2. The heuristic in question is designed to only work in the > > single > > process/single threaded case. > > Make sense to me now. > > I'm wondering if this might be creating a READDIR op amplification > for N > readers sharing the directory's fd because one process can be > dropping the > cache, which artificially deflates desc->page_index for a given > dir_cookie. > That lower page_index gets reused on the next pass dropping the > cache, and > it'll keep using the bottom pages and doing READDIRs. That wasn't > possible > before because we only invalidated past nfsi->page_index. > > Maybe an unlikely case (but I think some http servers could behave > this > way), and I'd have to test it to be sure. I can try to do that > unless you > think its not possible or concerning. > It is concerning, and indeed in our test we are seeing READDIR amplification with these multiple process accesses. So scenarios like the one you describe above are exactly the kind of issue I was looking to fix with these patches. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx