Re: [PATCH v9 23/27] NFS: Convert readdir page cache to use a cookie based index

Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> · Thu, 10 Mar 2022 21:07:41 +0000

On Wed, 2022-03-09 at 15:01 -0500, Benjamin Coddington wrote:
> On 27 Feb 2022, at 18:12, trondmy@xxxxxxxxxx wrote:
> 
> > From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> > 
> > Instead of using a linear index to address the pages, use the cookie of
> > the first entry, since that is what we use to match the page anyway.
> > 
> > This allows us to avoid re-reading the entire cache on a seekdir() type
> > of operation. The latter is very common when re-exporting NFS, and is a
> > major performance drain.
> > 
> > The change does affect our duplicate cookie detection, since we can no
> > longer rely on the page index as a linear offset for detecting whether
> > we looped backwards. However, since we no longer do a linear search
> > through all the pages on each call to nfs_readdir(), this is less of a
> > concern than it was previously.
> > The other downside is that invalidate_mapping_pages() can no longer use
> > the page index to avoid clearing pages that have been read. A subsequent
> > patch will restore the functionality this provides to the 'ls -l'
> > heuristic.
> 
> I didn't realize the approach was to also hash out the linearly-cached
> entries.  I thought we'd do something like flag the context for hashed page
> indexes after a seekdir event, and if there are collisions with the linear
> entries, they'll get fixed up when found.

Why? What's the point of using 2 models where 1 will do?

> 
> Doesn't that mean that with this approach seekdir() only hits the same pages
> when the entry offset is page-aligned?  That's 1 in 127 odds.

The point is not to stomp all over the pages that contain aligned data
when the application does call seekdir().

IOW: we always optimise for the case where we do a linear read of the
directory, but we support random seekdir() + read too.
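
To illustrate the behaviour described above, here is a minimal userspace
sketch, not the kernel code. It assumes 127 entries per page (the figure
quoted in this thread), treats cookies as sequential ordinals (real NFS
cookies are opaque), and uses a plain modulo as a stand-in for whatever
index function the patch actually uses. All names (rd_cache, rd_slot,
rd_lookup_or_fill, rd_linear_read) are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

#define ENTRIES_PER_PAGE 127u	/* assumption: the figure quoted in this thread */
#define CACHE_SLOTS      64u	/* hypothetical; tiny on purpose */

struct rd_page {
	int      valid;
	uint64_t first_cookie;	/* cookie of the first entry held in this page */
};

struct rd_cache {
	struct rd_page slot[CACHE_SLOTS];
};

/* Hypothetical index function: the slot depends only on the first cookie. */
static size_t rd_slot(uint64_t cookie)
{
	return (size_t)(cookie % CACHE_SLOTS);
}

/*
 * Look up the page whose first entry has this cookie.
 * Returns 1 on a hit. On a miss, claims the slot for this cookie
 * (evicting whatever was there) and returns 0.
 */
static int rd_lookup_or_fill(struct rd_cache *c, uint64_t cookie)
{
	struct rd_page *p = &c->slot[rd_slot(cookie)];

	if (p->valid && p->first_cookie == cookie)
		return 1;
	p->valid = 1;
	p->first_cookie = cookie;
	return 0;
}

/* Linear read from the start of the directory; returns the number of hits. */
static unsigned int rd_linear_read(struct rd_cache *c, unsigned int npages)
{
	unsigned int i, hits = 0;

	for (i = 0; i < npages; i++)
		hits += rd_lookup_or_fill(c, (uint64_t)i * ENTRIES_PER_PAGE);
	return hits;
}
```

In this model, a seek to an unaligned cookie such as 200 fills a fresh
page keyed by that cookie, and the aligned pages keyed by 0, 127, 254 and
381 still hit on the next linear read. Two first cookies that map to the
same slot would evict each other in this sketch.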

> 
> It also means we're amplifying the pagecache's usage for slightly changing
> directories - rather than re-using the same pages we're scattering our usage
> across the index.  Eh, maybe not a big deal if we just expect the page
> cache's LRU to do the work.
> 

I don't understand your point about 'not reusing'. If the user seeks to
the same cookie, we reuse the page. However, I don't know how you would
go about setting up a scheme that allows you to seek to an arbitrary
cookie without doing a linear search.
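
As a rough sketch of the trade-off, the two addressing models can be
compared in userspace. The hash below is a multiplicative hash in the
style of the kernel's hash_64(). The 18-bit index width, the special case
for cookie 0, and the function names are assumptions made for
illustration, not a description of the patch itself.

```c
#include <stdint.h>
#include <stddef.h>

#define INDEX_BITS 18	/* hypothetical width of the page index space */

/* Cookie-based model: the page index is computed directly from the cookie. */
static uint64_t cookie_to_index(uint64_t cookie)
{
	if (cookie == 0)
		return 0;	/* assumption: start of directory stays at index 0 */
	/* 64-bit golden-ratio multiplier, keeping the top INDEX_BITS bits */
	return (cookie * 0x61C8864680B583EBull) >> (64 - INDEX_BITS);
}

/*
 * Linear model: pages are numbered 0, 1, 2, ... in directory order, so
 * locating a cookie means walking the pages from the start.
 * Returns how many pages were examined (npages if the cookie is not found).
 * Simplification: only the first cookie of each page is compared.
 */
static size_t linear_pages_examined(const uint64_t *first_cookies,
				    size_t npages, uint64_t cookie)
{
	size_t i;

	for (i = 0; i < npages; i++)
		if (first_cookies[i] == cookie)
			return i + 1;
	return npages;
}
```

With the cookie-based index, the cost of locating a page does not depend
on how far into the directory the cookie lies. The price is that a page
is only found again if the seek lands on the cookie that page was keyed
by.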

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx