On Wed, 2022-03-09 at 15:01 -0500, Benjamin Coddington wrote:
> On 27 Feb 2022, at 18:12, trondmy@xxxxxxxxxx wrote:
>
> > From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> >
> > Instead of using a linear index to address the pages, use the
> > cookie of the first entry, since that is what we use to match the
> > page anyway.
> >
> > This allows us to avoid re-reading the entire cache on a seekdir()
> > type of operation. The latter is very common when re-exporting NFS,
> > and is a major performance drain.
> >
> > The change does affect our duplicate cookie detection, since we can
> > no longer rely on the page index as a linear offset for detecting
> > whether we looped backwards. However since we no longer do a linear
> > search through all the pages on each call to nfs_readdir(), this is
> > less of a concern than it was previously.
> > The other downside is that invalidate_mapping_pages() no longer can
> > use the page index to avoid clearing pages that have been read. A
> > subsequent patch will restore the functionality this provides to
> > the 'ls -l' heuristic.
>
> I didn't realize the approach was to also hash out the linearly-cached
> entries. I thought we'd do something like flag the context for hashed
> page indexes after a seekdir event, and if there are collisions with
> the linear entries, they'll get fixed up when found.

Why? What's the point of using 2 models where 1 will do?

>
> Doesn't that mean that with this approach seekdir() only hits the
> same pages when the entry offset is page-aligned? That's 1 in 127
> odds.

The point is not to stomp all over the pages that contain aligned data
when the application does call seekdir().

IOW: we always optimise for the case where we do a linear read of the
directory, but we support random seekdir() + read too.
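
To make the lookup scheme concrete, here is a minimal user-space sketch
in C. It is not the patch itself: the names (toy_dir_cache,
toy_cookie_index, toy_fill, toy_lookup), the 64-slot table and the
multiplicative hash constant are all assumptions made for illustration.
It only shows how keying a page by the cookie of its first entry lets a
seekdir()-style lookup go straight to one slot instead of walking the
pages in order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy sizes; a real page cache index space is far larger. */
#define TOY_HASH_BITS 6
#define TOY_NPAGES    (1u << TOY_HASH_BITS)

/* One cached "page" of directory entries, identified by its first cookie. */
struct toy_page {
	int      valid;
	uint64_t first_cookie;
};

struct toy_dir_cache {
	struct toy_page pages[TOY_NPAGES];
};

/*
 * Map a cookie to a slot index. Cookie 0 (start of directory) is pinned
 * to slot 0; every other cookie goes through a multiplicative hash.
 * The constant is an arbitrary odd 64-bit multiplier chosen for this sketch.
 */
static unsigned int toy_cookie_index(uint64_t cookie)
{
	if (cookie == 0)
		return 0;
	return (unsigned int)((cookie * 0x9E3779B97F4A7C15ULL) >>
			      (64 - TOY_HASH_BITS));
}

/* Record that the page starting at 'cookie' is cached; returns its slot. */
static unsigned int toy_fill(struct toy_dir_cache *cache, uint64_t cookie)
{
	unsigned int idx;

	assert(cache != NULL);
	idx = toy_cookie_index(cookie);
	/* A colliding page is simply overwritten in this toy model. */
	cache->pages[idx].valid = 1;
	cache->pages[idx].first_cookie = cookie;
	return idx;
}

/*
 * Look up the page whose first entry has 'cookie'. One slot is probed;
 * no linear walk over the other pages is needed. Returns NULL on a miss
 * or when the slot holds a page that starts at a different cookie.
 */
static struct toy_page *toy_lookup(struct toy_dir_cache *cache, uint64_t cookie)
{
	struct toy_page *page;

	assert(cache != NULL);
	page = &cache->pages[toy_cookie_index(cookie)];
	if (page->valid && page->first_cookie == cookie)
		return page;
	return NULL;
}
```

In this model a seek to a cookie that already starts a cached page
reuses that page, and a seek to any other cookie misses and fills a new
slot without touching the pages laid down by a linear read, apart from
hash collisions.
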
>
> It also means we're amplifying the pagecache's usage for slightly
> changing directories - rather than re-using the same pages we're
> scattering our usage across the index. Eh, maybe not a big deal if we
> just expect the page cache's LRU to do the work.
>

I don't understand your point about 'not reusing'. If the user seeks to
the same cookie, we reuse the page. However I don't know how you would
go about setting up a schema that allows you to seek to an arbitrary
cookie without doing a linear search.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx