Re: [PATCH v6 05/13] NFS: Improve algorithm for falling back to uncached readdir

Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> · Wed, 23 Feb 2022 12:17:04 +0000

On Tue, 2022-02-22 at 15:21 -0500, Benjamin Coddington wrote:
> On 22 Feb 2022, at 15:11, Trond Myklebust wrote:
> 
> > On Tue, 2022-02-22 at 07:50 -0500, Benjamin Coddington wrote:
> > > On 21 Feb 2022, at 18:20, Trond Myklebust wrote:
> > > 
> > > > On Mon, 2022-02-21 at 16:10 -0500, Benjamin Coddington wrote:
> > > > > On 21 Feb 2022, at 15:55, Trond Myklebust wrote:
> > > > > > 
> > > > > > We will always need the ability to cut over to uncached
> > > > > > readdir.
> > > > > 
> > > > > Yes.
> > > > > 
> > > > > > If the cookie is no longer returned by the server because
> > > > > > one
> > > > > > or more
> > > > > > files were deleted then we need to resolve the situation
> > > > > > somehow (IOW:
> > > > > > the 'rm *' case). The new algorithm _does_ improve
> > > > > > performance
> > > > > > on those
> > > > > > situations, because it no longer requires us to read the
> > > > > > entire
> > > > > > directory before switching over: we try 5 times, then fail
> > > > > > over.
> > > > > 
> > > > > Yes, using per-page validation doesn't remove the need for
> > > > > uncached
> > > > > readdir.  It does allow a reader to simply resume filling the
> > > > > cache where
> > > > > it left off.  There's no need to try 5 times and fail over. 
> > > > > And
> > > > > there's
> > > > > no need to make a trade-off and make the situation worse in
> > > > > certain
> > > > > scenarios.
> > > > > 
> > > > > I thought I'd point that out and make an offer to re-submit
> > > > > it. 
> > > > > Any
> > > > > interest?
> > > > > 
> > > > 
> > > > As I recall, I had concerns about that approach. Can you
> > > > explain
> > > > again
> > > > how it will work?
> > > 
> > > Every page of readdir results has the directory's change attr
> > > stored
> > > on the
> > > page.  That, along with the page's index and the first cookie is
> > > enough
> > > information to determine if the page's data can be used by
> > > another
> > > process.
> > > 
> > > Which means that when the pagecache is dropped, fillers don't
> > > have to
> > > restart
> > > filling the cache at page index 0, they can continue to fill at
> > > whatever
> > > index they were at previously.  If another process finds a page
> > > that
> > > doesn't
> > > match its page index, cookie, and the current directory's change
> > > attr, the
> > > page is dropped and refilled from that process' indexing.
> > > 
> > > > A few of the concerns I have revolve around
> > > > telldir()/seekdir(). If
> > > > the
> > > > platform is 32-bit, then we cannot use cookies as the telldir()
> > > > output,
> > > > and so our only option is to use offsets into the page cache
> > > > (this
> > > > is
> > > > why this patch carves out an exception if desc->dir_cookie ==
> > > > 0).
> > > > How
> > > > would that work with you scheme?
> > > 
> > > For 32-bit seekdir, pages are filled starting at index 0.  This
> > > is
> > > very
> > > unlikely to match other readers (unless they also do the _same_
> > > seekdir).
> > > 
> > > > Even in the 64-bit case where are able to use cookies for
> > > > telldir()/seekdir(), how do we determine an appropriate page
> > > > index
> > > > after a seekdir()?
> > > 
> > > We don't.  Instead we start filling at index 0.  Again, the
> > > pagecache
> > > will
> > > only be useful to other processes that have done the same llseek.
> > > 
> > > This approach optimizes the pagecache for processes that are
> > > doing
> > > similar
> > > work, and has the added benefit of scaling well for large
> > > directory
> > > listings
> > > under memory pressure.  Also a number of classes of directory
> > > modifications
> > > (such as renames, or insertions/deletions at locations a reader
> > > has
> > > moved
> > > beyond) are no longer a reason to re-fill the pagecache from
> > > scratch.
> > > 
> > 
> > OK, you've got me more or less sold on it.
> > 
> > I'd still like to figure out how to improve the performance for
> > seekdir
> > (since I do have an interest in re-exporting NFS) but I've been
> > playing
> > around with a couple of patches that implement your concept and
> > they do
> > seem to work well for the common case of a linear read through the
> > directory.
> 
> Nice.  I have another version from the one I originally posted:
> https://lore.kernel.org/linux-nfs/cover.1611160120.git.bcodding@xxxxxxxxxx/
> 
> .. but I don't remember exactly the changes and it needs rebasing. 
> Should I
> rebase it against your testing branch and send the result?

My 2 patches did something slightly different to yours, storing the
change attribute in the array header instead of in page_private. That
makes for a slightly smaller change.

Having further slept on the idea, I think I know how to solve the
seekdir() issue, at least for the cases where we can use cookies as
telldir()/seekdir() offsets. If we ditch the linear index, and instead
use a hash of the desc->last_cookie as the index, then we can make
random access to the directories work with your idea.
There are a couple of further problems with this concept, but I think
those are solvable. The issues I have identified so far are:

 * invalidate_inode_pages2_range() no longer works to clear out the
   page cache in a predictable way for the readdirplus heuristic.
 * duplicate cookie detection needs to be changed.

I think those are solvable problems.

-- 
Trond Myklebust Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx