Re: directory caching & negative file lookups?

On Fri, 2024-04-12 at 12:43 +0100, Daire Byrne wrote:
> On Fri, 12 Apr 2024 at 11:21, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > 
> > On Fri, 2024-04-12 at 10:11 +0100, Daire Byrne wrote:
> > > Thanks for the clarity Trond - I promise not to forget this time and
> > > ask the same question again in 2 years!
> > > 
> > > It just keeps coming up here at DNEG due to accessing software over
> > > NFS and crazy PYTHONPATH usage by some of our developers. In some
> > > cases, there are 57,000 negative lookups but only 5000 positive
> > > lookups (and opens)!
> > > 
> > > Getting devs to optimise their code is my cross to bear I guess.
> > > 
> > > But this is also a well known and common problem for large batch farms
> > > and there are some novel workarounds out there:
> > > 
> > > https://guix.gnu.org/en/blog/2021/taming-the-stat-storm-with-a-loader-cache
> > > https://computing.llnl.gov/projects/spindle
> > > https://cernvm.cern.ch/fs/
> > > 
> > > Coupled with our propensity for high latency (~100ms) NFS via
> > > re-export servers (for "cloud rendering"), these inefficient path
> > > lookups quickly become a killer - the application takes longer to
> > > look up non-existent files and open files than it does to execute to
> > > completion. We use aggressive caching (actimeo=3600,nocto,vers=3) and
> > > "preload" metadata ops (ls -l, open) on a regular basis to try and
> > > keep things in (re-export) client cache which certainly helps. It's
> > > hard to keep known (expensive) metadata worksets in memory.
> > > 
> > > I've also been looking at using an overlay and hand crafting whiteout
> > > files in the upper layers to essentially block known negative lookups
> > > from hitting the lower NFS share - again, only useful and correct for
> > > read-only software shares.
> > > 
> > > I wonder if Jeff Layton's directory delegations will help for
> > > (read-only) metadata heavy lookups over the WAN?
> > > 
> > 
> > Probably not. In order to optimize away lookups of negative dentries
> > that aren't in cache, you need to know all of the positive dentries in
> > the directory. As Trond pointed out earlier in the discussion, NFS
> > doesn't have a concept of directory "completeness", so we can't
> > reasonably do this.
> > 
> > FWIW, CephFS does have such a concept and can satisfy readdir requests
> > and negative lookups out of the cache when it has complete directory
> > info.
> 
> Out of interest, do directory delegations help with positive lookups
> or repeat opens? They may be less numerous in our badly behaved
> workloads, but they are still nice to optimise for latency.
> 
> Can you disable "cto" for example if you have a directory delegation
> and repeatedly open the same file for reading without a network hop?

Maybe? Dir delegations don't really help with CTO, since that's all
about the file itself, not its parent directory. It might help avoid
having to revalidate the parent directory for the lookup however.

FWIW, basic, recallable directory delegations with no notifications are
pretty useless in my testing. You optimize away a few GETATTRs on the
parent directories, but those are pretty infrequent anyway -- 1 every
60s or so on directories that aren't changing much by default.

That's close to "why bother" territory, but maybe there is a case to be
made for that on high-latency links (like you mention).
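
To put rough numbers on that, here is a back-of-the-envelope sketch. It
assumes the ~100ms round trip and the 57,000 negative lookups quoted
earlier in this thread, one GETATTR per directory per 60s, and fully
serialised requests. The directory count (200) and the one-hour window
are made-up values for illustration only.

```python
# Rough estimate only. RTT and lookup count come from figures quoted in
# this thread; n_dirs and window_s in the example are hypothetical.
RTT_MS = 100                # ~100ms round trip on the re-export link
REVALIDATE_EVERY_S = 60     # roughly one GETATTR per directory per 60s
NEGATIVE_LOOKUPS = 57_000   # negative lookups in one bad application start


def getattr_ms_saved(n_dirs, window_s):
    """Wire time (ms) spent on parent-directory GETATTRs over window_s."""
    getattrs = n_dirs * (window_s // REVALIDATE_EVERY_S)
    return getattrs * RTT_MS


def negative_lookup_ms(n_lookups=NEGATIVE_LOOKUPS):
    """Wire time (ms) for uncached negative lookups, if fully serialised."""
    return n_lookups * RTT_MS


if __name__ == "__main__":
    print("GETATTRs avoided over an hour (ms):", getattr_ms_saved(200, 3600))
    print("negative lookups per run (ms):", negative_lookup_ms())
```

Note the two numbers are not directly comparable: the GETATTR cost is
spread across the whole window, while the negative lookups are paid on
every application start.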

Mixing in notifications may change things though:

Consider 2 clients that are both working with files in the same
directory and both hold directory delegations. client1 creates a file or
another directory in the dir. Server then pushes out a notification to
client2. client2 goes to look up the new dentry later, and finds that
it's already in cache.

That's a potential optimization, but it's pretty specific to workloads
where multiple clients are operating on the same files in a directory
that is frequently changing.
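
The two-client scenario above can be sketched as a toy model. This is
not the NFS wire protocol or the Linux client code; ToyServer,
ToyClient and their methods are hypothetical names, and the model only
counts lookups that would have to cross the network.

```python
# Toy model only: not the NFS protocol, just a counter of wire lookups.

class ToyServer:
    def __init__(self):
        self.entries = set()
        self.delegated = []      # clients holding a directory delegation
        self.lookup_rpcs = 0     # lookups that had to cross the network

    def create(self, creator, name):
        self.entries.add(name)
        # Push a notification to every other delegation holder.
        for client in self.delegated:
            if client is not creator:
                client.notify_added(name)

    def lookup(self, name):
        self.lookup_rpcs += 1
        return name in self.entries


class ToyClient:
    def __init__(self, server, notifications):
        self.server = server
        self.notifications = notifications
        self.dcache = set()      # positive dentries only
        server.delegated.append(self)

    def notify_added(self, name):
        # Without notifications the delegation tells us nothing new.
        if self.notifications:
            self.dcache.add(name)

    def create(self, name):
        self.server.create(self, name)
        self.dcache.add(name)

    def lookup(self, name):
        if name in self.dcache:
            return True          # served from cache, no round trip
        found = self.server.lookup(name)
        if found:
            self.dcache.add(name)
        return found


def lookups_on_wire(notifications):
    """client1 creates a file, client2 looks it up; count wire lookups."""
    server = ToyServer()
    client1 = ToyClient(server, notifications)
    client2 = ToyClient(server, notifications)
    client1.create("newfile")
    assert client2.lookup("newfile")
    return server.lookup_rpcs
```

In this model client2's lookup stays local only when notifications are
enabled. A lookup of a name that does not exist still goes to the
server either way, since the cache holds positive entries only.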

> 
> I also noticed that "nocto" can completely stop any subsequent network
> hops for opens (with a long actimeo) for NFSv3, but on NFSv4 it only
> cuts a single GETATTR before still doing an OPEN DH over the network
> each time.
> 

File delegations can allow you to do an open w/o having to cross the
network. If I hold the right sort of deleg on a file, I should be able
to open it without talking to the server.

Dir delegations could help optimize away some round trips for the
lookups leading up to the open however.
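
As a rough illustration of where each kind of delegation could save a
round trip, here is a worst-case toy count. It assumes an absolute
path, that nothing else is cached, that each path component costs one
round trip unless its parent directory is covered by a delegation, and
that the open costs one more unless a suitable file delegation is
held. The function name and parameters are hypothetical, and the real
client's caching is more nuanced than this.

```python
from pathlib import PurePosixPath


def open_round_trips(path, delegated_dirs, file_delegation):
    """Count round trips for a path walk plus open in a toy model.

    path            absolute POSIX path of the file being opened
    delegated_dirs  set of directory paths assumed covered by a delegation
    file_delegation True if a suitable file delegation is assumed held
    """
    parts = PurePosixPath(path).parts   # e.g. ('/', 'sw', 'lib', 'mod.py')
    trips = 0
    parent = PurePosixPath(parts[0])
    for name in parts[1:]:
        if str(parent) not in delegated_dirs:
            trips += 1                  # lookup of `name` in `parent`
        parent = parent / name
    if not file_delegation:
        trips += 1                      # the open itself
    return trips


if __name__ == "__main__":
    dirs = {"/", "/sw", "/sw/lib"}
    print(open_round_trips("/sw/lib/mod.py", set(), False))
    print(open_round_trips("/sw/lib/mod.py", dirs, False))
    print(open_round_trips("/sw/lib/mod.py", dirs, True))
```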

> I'm probably wandering off into "disconnected clients" and AFS style
> territory now...
> 
> 

> 
> > > On Fri, 5 Apr 2024 at 16:03, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
> > > > 
> > > > On Fri, 2024-04-05 at 15:47 +0100, Daire Byrne wrote:
> > > > > Apologies for dragging up an old thread, but I've had to tackle
> > > > > wayward negative lookup storms again and I have obviously half
> > > > > forgotten what I learned in this thread last time (even after
> > > > > re-reading it!).
> > > > > 
> > > > > > Can I just ask if I understand correctly and that there was an
> > > > > > intention a long time ago to be able to serve negative dentries
> > > > > > from a "complete" READDIRPLUS result?
> > > > > 
> > > > > https://www.cs.helsinki.fi/linux/linux-kernel/2002-30/0108.html
> > > > > 
> > > > > > So if we did a readdirplus on a directory then immediately fired
> > > > > > random non existent lookups at the directory, it could be served
> > > > > > from the readdirplus result? i.e. not in readdir result, then
> > > > > > return ENOENT without needing to ask server?
> > > > > 
> > > > > > But that is not the case today because you can't track the
> > > > > > "completeness" of a READDIRPLUS result for a directory over time
> > > > > > (in page cache)? Or is it all due to needing to deal with case
> > > > > > insensitive filesystems (which I would think affects positive
> > > > > > lookups too)?
> > > > > 
> > > > > > I did try to decipher the v6.6 fs/nfs/dir.c READDIR bits but I
> > > > > > quickly got lost...
> > > > > 
> > > > > Cheers,
> > > > > 
> > > > > Daire
> > > > 
> > > > If the question is whether the client trusts that a READDIR call to the
> > > > server returns all the names that can be successfully looked up, then
> > > > the answer is "no".
> > > > It's not even a question of case sensitivity. There are plenty of
> > > > servers out there that will allow you to look up names that won't ever
> > > > appear in the results of a READDIR (or READDIRPLUS) call. Having a
> > > > hidden ".snapshot" directory is, for instance, a popular way to present
> > > > snapshots.
> > > > 
> > > > So no, we're not ever going to implement any negative dentry cache
> > > > scheme that relies on READDIR/READDIRPLUS.
> > > > --
> > > > Trond Myklebust
> > > > Linux NFS client maintainer, Hammerspace
> > > > trond.myklebust@xxxxxxxxxxxxxxx
> > > > 
> > > > 
> > > 
> > 
> > --
> > Jeff Layton <jlayton@xxxxxxxxxx>

-- 
Jeff Layton <jlayton@xxxxxxxxxx>
