On 21 Feb 2022, at 14:58, Trond Myklebust wrote:

> On Mon, 2022-02-21 at 11:45 -0500, Benjamin Coddington wrote:
>> On 21 Feb 2022, at 11:08, trondmy@xxxxxxxxxx wrote:
>>
>>> From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
>>>
>>> When reading a very large directory, we want to try to keep the page
>>> cache up to date if doing so is inexpensive. Right now, we will try to
>>> refill the page cache if it is non-empty, irrespective of whether or not
>>> doing so is going to take a long time.
>>>
>>> Replace that algorithm with something that looks at how many times we've
>>> refilled the page cache without seeing a cache hit.
>>
>> Hi Trond, I've been following your work here - thanks for it.
>>
>> I'm wondering if there might be a regression on this patch for the case
>> where two or more directory readers are part way through a large directory
>> when the pagecache is truncated. If I'm reading this correctly, those
>> readers will stop caching after 5 fills and finish the remainder of their
>> directory reads in the uncached mode.
>>
>> Isn't there an OP amplification per reader in this case?
>>
>
> Depends... In the old case, we basically stopped doing uncached readdir
> if a third process starts filling the page cache again. In particular,
> this means we were vulnerable to restarting over and over once page
> reclaim starts to kick in for very large directories.
>
> In this new one, we have each process give it a try (5 fills each), and
> then fallback to uncached. Yes, there will be corner cases where this
> will perform less well than the old algorithm, but it should also be
> more deterministic.
>
> I am open to suggestions for better ways to determine when to cut over
> to uncached readdir. This is one way, that I think is better than what
> we have, however I'm sure it can be improved upon.

I still have old patches that allow each page to be "versioned" with the
change attribute, page_index, and cookie.
This allows the page cache to be culled page-by-page, and multiple fillers
can continue to fill pages at "headless" page offsets that match their
original cookie and page_index pair. This change would mean readers don't
have to start over filling the page cache when the cache is dropped, so we
wouldn't need to worry about when to cut over to the uncached mode - it
makes the problem go away.

I felt there wasn't much interest in this work, and our most vocal customer
was happy enough with last winter's readdir improvements (thanks!) that I
didn't follow up, but I can refresh those patches and send them along again.

Ben
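
For reference, the cut-over heuristic quoted above (each reader gets 5 fills
without a cache hit, then falls back to uncached readdir) could be modelled
roughly as below. This is only a userspace sketch: the struct, the function
names, and the assumption that a cache hit resets the counter are all
hypothetical and are not taken from the patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Threshold taken from the "5 fills each" mentioned in the thread. */
#define MAX_FILLS_WITHOUT_HIT 5

/* Hypothetical per-reader state; not the kernel's actual structure. */
struct reader_state {
	unsigned int fills_without_hit; /* fills since the last cache hit */
	bool uncached;                  /* true once the reader has cut over */
};

/* Assumption: a cache hit shows caching is paying off, so reset the count. */
static void note_cache_hit(struct reader_state *rs)
{
	assert(rs != NULL);
	rs->fills_without_hit = 0;
}

/* Record one page-cache fill; cut over once the threshold is reached. */
static void note_cache_fill(struct reader_state *rs)
{
	assert(rs != NULL);
	if (rs->uncached)
		return;
	if (++rs->fills_without_hit >= MAX_FILLS_WITHOUT_HIT)
		rs->uncached = true;
}

/* Count the fills a reader performs before cutting over when it never hits. */
static unsigned int fills_until_uncached(void)
{
	struct reader_state rs = { 0, false };
	unsigned int fills = 0;

	while (!rs.uncached) {
		note_cache_fill(&rs);
		fills++;
	}
	return fills;
}
```

In this model each reader keeps its own counter, so N readers that lose the
cache mid-directory each pay for up to 5 fills before going uncached, which
is the per-reader amplification raised earlier in the thread.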