On Mon, 2022-02-21 at 11:45 -0500, Benjamin Coddington wrote:
> On 21 Feb 2022, at 11:08, trondmy@xxxxxxxxxx wrote:
>
> > From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> >
> > When reading a very large directory, we want to try to keep the page
> > cache up to date if doing so is inexpensive. Right now, we will try
> > to refill the page cache if it is non-empty, irrespective of whether
> > or not doing so is going to take a long time.
> >
> > Replace that algorithm with something that looks at how many times
> > we've refilled the page cache without seeing a cache hit.
>
> Hi Trond, I've been following your work here - thanks for it.
>
> I'm wondering if there might be a regression on this patch for the
> case where two or more directory readers are part way through a large
> directory when the pagecache is truncated. If I'm reading this
> correctly, those readers will stop caching after 5 fills and finish
> the remainder of their directory reads in the uncached mode.
>
> Isn't there an OP amplification per reader in this case?

Depends... In the old case, we basically stopped doing uncached readdir
if a third process started filling the page cache again. In particular,
this meant we were vulnerable to restarting over and over once page
reclaim started to kick in for very large directories.

In this new one, each process gives the page cache a try (5 fills
each), and then falls back to uncached readdir. Yes, there will be
corner cases where this performs less well than the old algorithm, but
it should also be more deterministic.

I am open to suggestions for better ways to determine when to cut over
to uncached readdir. This is one way that I think is better than what
we have; however, I'm sure it can be improved upon.

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx
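
For concreteness, here is a minimal standalone C sketch of the heuristic
being discussed. The names (readdir_ctx, readdir_cache_hit,
readdir_page_fill, READDIR_FILLS_MAX) and the exact counter handling are
illustrative assumptions, not code from the actual patch: each reader
counts page cache fills since its last cache hit, and switches itself to
uncached readdir once that count reaches 5.

	/*
	 * Illustrative sketch only; identifiers are hypothetical and do
	 * not come from the NFS patch under discussion.
	 */
	#include <stdbool.h>
	#include <stdio.h>

	#define READDIR_FILLS_MAX 5	/* per-reader fill budget */

	struct readdir_ctx {
		unsigned int fills;	/* pagecache fills since last cache hit */
		bool uncached;		/* true once this reader gives up on caching */
	};

	/* A cache hit means the pagecache is paying off; reset the budget. */
	static void readdir_cache_hit(struct readdir_ctx *ctx)
	{
		ctx->fills = 0;
	}

	/* Each refill consumes budget; past the limit, go uncached. */
	static void readdir_page_fill(struct readdir_ctx *ctx)
	{
		if (!ctx->uncached && ++ctx->fills >= READDIR_FILLS_MAX)
			ctx->uncached = true;
	}

	int main(void)
	{
		struct readdir_ctx ctx = { 0 };

		/* Simulate a reader that refills repeatedly with no hits. */
		for (int i = 0; i < 6; i++)
			readdir_page_fill(&ctx);
		printf("uncached after 6 fills, no hits: %d\n", ctx.uncached);

		/* readdir_cache_hit(&ctx) at any point would reset the count. */
		return 0;
	}

The point of this shape is that the counter is per reader, so one reader
cutting over to uncached mode does not force other readers out of the
page cache, and any cache hit restores the full budget.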