Re: [PATCH v6 05/13] NFS: Improve algorithm for falling back to uncached readdir

"Benjamin Coddington" <bcodding@xxxxxxxxxx> · Mon, 21 Feb 2022 15:22:34 -0500

On 21 Feb 2022, at 14:58, Trond Myklebust wrote:

> On Mon, 2022-02-21 at 11:45 -0500, Benjamin Coddington wrote:
>> On 21 Feb 2022, at 11:08, trondmy@xxxxxxxxxx wrote:
>>
>>> From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
>>>
>>> When reading a very large directory, we want to try to keep the page
>>> cache up to date if doing so is inexpensive. Right now, we will try to
>>> refill the page cache if it is non-empty, irrespective of whether or not
>>> doing so is going to take a long time.
>>>
>>> Replace that algorithm with something that looks at how many times we've
>>> refilled the page cache without seeing a cache hit.
>>
>> Hi Trond, I've been following your work here - thanks for it.
>>
>> I'm wondering if there might be a regression on this patch for the case
>> where two or more directory readers are part way through a large directory
>> when the pagecache is truncated.  If I'm reading this correctly, those
>> readers will stop caching after 5 fills and finish the remainder of their
>> directory reads in the uncached mode.
>>
>> Isn't there an OP amplification per reader in this case?
>>
>
> Depends... In the old case, we basically stopped doing uncached readdir
> if a third process started filling the page cache again. In particular,
> this means we were vulnerable to restarting over and over once page
> reclaim started to kick in for very large directories.
>
> In this new one, we have each process give it a try (5 fills each), and
> then fall back to uncached. Yes, there will be corner cases where this
> will perform less well than the old algorithm, but it should also be
> more deterministic.
>
> I am open to suggestions for better ways to determine when to cut over
> to uncached readdir. This is one way that I think is better than what
> we have; however, I'm sure it can be improved upon.
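
To make the heuristic under discussion concrete, here is a rough standalone
sketch of "give each reader 5 fills without a cache hit, then cut over to
uncached". It is not the code from the patch: the struct, the function names
and the exact point at which the count trips are all assumptions made for
illustration.

```c
#include <stdbool.h>

/* Hypothetical per-reader state; not the kernel's actual structures. */
struct reader_state {
	unsigned int fills_without_hit;	/* cache fills since the last cache hit */
};

/* The "5 fills" mentioned in the thread; the exact cutoff is an assumption. */
#define MAX_FILLS_WITHOUT_HIT 5

/* A cache hit shows that filling is paying off, so reset the counter. */
static void reader_cache_hit(struct reader_state *rs)
{
	rs->fills_without_hit = 0;
}

/*
 * Ask whether this reader may fill the page cache once more.
 * Returns true and counts the fill while under the limit; returns
 * false once the reader should fall back to uncached readdir.
 */
static bool reader_may_fill(struct reader_state *rs)
{
	if (rs->fills_without_hit >= MAX_FILLS_WITHOUT_HIT)
		return false;
	rs->fills_without_hit++;
	return true;
}
```

With this shape, every reader carries its own counter, which is where the
per-reader cost in my question comes from: after a truncation, each reader
spends its own budget of fills before giving up.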

I still have old patches that allow each page to be "versioned" with the
change attribute, page_index, and cookie.  This allows the page cache to be
culled page-by-page, and multiple fillers can continue to fill pages at
"headless" page offsets that match their original cookie and page_index
pair.  This change would mean readers don't have to start over filling the
page cache when the cache is dropped, so we wouldn't need to worry about
when to cut over to the uncached mode - it makes the problem go away.
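
As a rough illustration of the per-page check this implies (a minimal sketch
only, not taken from the old patches; every name and field layout below is
hypothetical):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-page version tag; field names are illustrative only. */
struct page_version {
	uint64_t change_attr;		/* directory change attribute at fill time */
	uint64_t first_cookie;		/* cookie of the first entry on the page */
	unsigned long page_index;	/* page offset this page was filled at */
	bool filled;			/* false for a page that holds no entries */
};

/* Record the version of a page that a reader has just filled. */
static void page_version_set(struct page_version *pv, uint64_t change_attr,
			     unsigned long page_index, uint64_t cookie)
{
	pv->change_attr = change_attr;
	pv->page_index = page_index;
	pv->first_cookie = cookie;
	pv->filled = true;
}

/*
 * A cached page is usable by a reader only if it was filled against the
 * same directory version and at the same (page_index, cookie) position
 * the reader expects. A mismatch invalidates just this one page.
 */
static bool page_version_matches(const struct page_version *pv,
				 uint64_t change_attr,
				 unsigned long page_index, uint64_t cookie)
{
	return pv->filled &&
	       pv->change_attr == change_attr &&
	       pv->page_index == page_index &&
	       pv->first_cookie == cookie;
}
```

A reader that finds a mismatch would refill only that page at its own
offset and carry on, rather than restarting from the head of the directory.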

I felt there wasn't much interest in this work, and our most vocal customer
was happy enough with last winter's readdir improvements (thanks!) that I
didn't follow up, but I can refresh those patches and send them along again.

Ben