Re: [PATCH v6 05/13] NFS: Improve algorithm for falling back to uncached readdir

"Benjamin Coddington" <bcodding@xxxxxxxxxx> · Tue, 22 Feb 2022 15:21:28 -0500

On 22 Feb 2022, at 15:11, Trond Myklebust wrote:

> On Tue, 2022-02-22 at 07:50 -0500, Benjamin Coddington wrote:
>> On 21 Feb 2022, at 18:20, Trond Myklebust wrote:
>>
>>> On Mon, 2022-02-21 at 16:10 -0500, Benjamin Coddington wrote:
>>>> On 21 Feb 2022, at 15:55, Trond Myklebust wrote:
>>>>>
>>>>> We will always need the ability to cut over to uncached
>>>>> readdir.
>>>>
>>>> Yes.
>>>>
>>>>> If the cookie is no longer returned by the server because one
>>>>> or more
>>>>> files were deleted then we need to resolve the situation
>>>>> somehow (IOW:
>>>>> the 'rm *' case). The new algorithm _does_ improve performance
>>>>> on those
>>>>> situations, because it no longer requires us to read the entire
>>>>> directory before switching over: we try 5 times, then fail
>>>>> over.
>>>>
>>>> Yes, using per-page validation doesn't remove the need for
>>>> uncached
>>>> readdir.  It does allow a reader to simply resume filling the
>>>> cache where
>>>> it left off.  There's no need to try 5 times and fail over.  And
>>>> there's
>>>> no need to make a trade-off and make the situation worse in
>>>> certain
>>>> scenarios.
>>>>
>>>> I thought I'd point that out and make an offer to re-submit it. 
>>>> Any
>>>> interest?
>>>>
>>>
>>> As I recall, I had concerns about that approach. Can you explain
>>> again
>>> how it will work?
>>
>> Every page of readdir results has the directory's change attr stored
>> on the
>> page.  That, along with the page's index and the first cookie is
>> enough
>> information to determine if the page's data can be used by another
>> process.
>>
>> Which means that when the pagecache is dropped, fillers don't have to
>> restart
>> filling the cache at page index 0, they can continue to fill at
>> whatever
>> index they were at previously.  If another process finds a page that
>> doesn't
>> match its page index, cookie, and the current directory's change
>> attr, the
>> page is dropped and refilled from that process' indexing.
>>
>>> A few of the concerns I have revolve around telldir()/seekdir(). If
>>> the
>>> platform is 32-bit, then we cannot use cookies as the telldir()
>>> output,
>>> and so our only option is to use offsets into the page cache (this
>>> is
>>> why this patch carves out an exception if desc->dir_cookie == 0).
>>> How
>>> would that work with you scheme?
>>
>> For 32-bit seekdir, pages are filled starting at index 0.  This is
>> very
>> unlikely to match other readers (unless they also do the _same_
>> seekdir).
>>
>>> Even in the 64-bit case where are able to use cookies for
>>> telldir()/seekdir(), how do we determine an appropriate page index
>>> after a seekdir()?
>>
>> We don't.  Instead we start filling at index 0.  Again, the pagecache
>> will
>> only be useful to other processes that have done the same llseek.
>>
>> This approach optimizes the pagecache for processes that are doing
>> similar
>> work, and has the added benefit of scaling well for large directory
>> listings
>> under memory pressure.  Also a number of classes of directory
>> modifications
>> (such as renames, or insertions/deletions at locations a reader has
>> moved
>> beyond) are no longer a reason to re-fill the pagecache from scratch.
>>
>
> OK, you've got me more or less sold on it.
>
> I'd still like to figure out how to improve the performance for seekdir
> (since I do have an interest in re-exporting NFS) but I've been playing
> around with a couple of patches that implement your concept and they do
> seem to work well for the common case of a linear read through the
> directory.

Nice.  I have another version from the one I originally posted:
https://lore.kernel.org/linux-nfs/cover.1611160120.git.bcodding@xxxxxxxxxx/

.. but I don't remember exactly the changes and it needs rebasing.  Should I
rebase it against your testing branch and send the result?

Ben