Re: [PATCH 4/5] NFS: Further optimisations for 'ls -l'

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2021-09-29 at 16:21 -0400, Benjamin Coddington wrote:
> On 29 Sep 2021, at 12:23, Trond Myklebust wrote:
> 
> > On Wed, 2021-09-29 at 10:56 -0400, Benjamin Coddington wrote:
> > > On 29 Sep 2021, at 9:49, trondmy@xxxxxxxxxx wrote:
> > > 
> > > > From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> > > > 
> > > > If a user is doing 'ls -l', we have a heuristic in GETATTR that
> > > > tells
> > > > the readdir code to try to use READDIRPLUS in order to refresh
> > > > the
> > > > inode
> > > > attributes. In certain cirumstances, we also try to invalidate
> > > > the
> > > > remaining directory entries in order to ensure this refresh.
> > > > 
> > > > If there are multiple readers of the directory, we probably
> > > > should
> > > > avoid
> > > > invalidating the page cache, since the heuristic breaks down in
> > > > that
> > > > situation anyway.
> > > 
> > > Hi Trond, I'm curious about the motivation for this work because
> > > we're often
> > > managing expectations about performance between various
> > > workloads. 
> > > What
> > > does the workload look like that prompted you to make this
> > > optimization?
> > > 
> > > I'm also interested because I have some work that improves
> > > readdir
> > > performance for multiple readers on large directories, but is
> > > rotting
> > > because things have been "good enough" for folks over here.
> > > 
> > 
> > Are you talking about this patch specifically, or the series in
> > general?
> 
> A bit of both - the first two patches didn't really make sense to me,
> but I
> get it now.  Thanks.
> 
> > The general motivation for the series is that in certain
> > circumstances
> > involving a read-only scenario (no changes to the directory) and
> > multiple processes we're seeing a regression w.r.t. older kernels.
> > 
> > The motivation for this particular patch is twofold:
> >    1. I want to get rid of the cached page_index in the inode and
> >       replace it with something less persistent (inodes are
> > forever,
> >       unlike file descriptors).
> >    2. The heuristic in question is designed to only work in the
> > single
> >       process/single threaded case.
> 
> Make sense to me now.
> 
> I'm wondering if this might be creating a READDIR op amplification
> for N
> readers sharing the directory's fd because one process can be
> dropping the
> cache, which artificially deflates desc->page_index for a given
> dir_cookie.
> That lower page_index gets reused on the next pass dropping the
> cache, and
> it'll keep using the bottom pages and doing READDIRs.  That wasn't
> possible
> before because we only invalidated past nfsi->page_index.
> 
> Maybe an unlikely case (but I think some http servers could behave
> this
> way), and I'd have to test it to be sure.  I can try to do that
> unless you
> think its not possible or concerning.
> 

It is concerning, and indeed in our test we are seeing READDIR
amplification with these multiple process accesses. So scenarios like
the one you describe above are exactly the kind of issue I was looking
to fix with these patches.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx






[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux