Re: directory caching & negative file lookups?

Daire Byrne <daire@xxxxxxxx> · Fri, 12 Apr 2024 10:11:00 +0100

Thanks for the clarity Trond - I promise not to forget this time and
ask the same question again in 2 years!

It just keeps coming up here at DNEG due to accessing software over
NFS and crazy PYTHONPATH usage by some of our developers. In some
cases, there are 57,000 negative lookups but only 5000 positive
lookups (and opens)!
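
To illustrate why a long search path produces so many misses, here is a
minimal Python sketch. It is a simplified model of a path search and
not how CPython's import machinery actually probes directories;
probe_search_path(), demo() and the pkg0..pkg9 directory names are made
up for illustration. Every directory that sits ahead of the real
location costs at least one failed lookup.

```python
import os
import tempfile


def probe_search_path(dirs, name):
    """Return (misses, hit_path): failed stat() calls before the first hit."""
    misses = 0
    for d in dirs:
        candidate = os.path.join(d, name)
        try:
            os.stat(candidate)
        except (FileNotFoundError, NotADirectoryError):
            misses += 1  # over NFS this is a negative LOOKUP round trip
            continue
        return misses, candidate
    return misses, None


def demo():
    """Build ten hypothetical search dirs with the file only in the last."""
    with tempfile.TemporaryDirectory() as root:
        dirs = []
        for i in range(10):
            d = os.path.join(root, f"pkg{i}")
            os.mkdir(d)
            dirs.append(d)
        open(os.path.join(dirs[-1], "mod.py"), "w").close()
        return probe_search_path(dirs, "mod.py")
```

With the file in the last of ten directories, demo() reports nine misses
for a single hit, which is the same shape as the ratio above.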

Getting devs to optimise their code is my cross to bear I guess.

But this is also a well known and common problem for large batch farms
and there are some novel workarounds out there:

https://guix.gnu.org/en/blog/2021/taming-the-stat-storm-with-a-loader-cache
https://computing.llnl.gov/projects/spindle
https://cernvm.cern.ch/fs/

Coupled with our propensity for high latency (~100ms) NFS via
re-export servers (for "cloud rendering"), these inefficient path
lookups quickly become a killer - the application takes longer to
look up non-existent files and open files than it does to execute to
completion. We use aggressive caching (actimeo=3600,nocto,vers=3) and
"preload" metadata ops (ls -l, open) on a regular basis to try to
keep things in the (re-export) client cache, which certainly helps.
Even so, it's hard to keep known (expensive) metadata working sets in
memory.
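
As a rough idea of what such a "preload" pass could look like, here is
a minimal Python sketch. It is an assumption-laden stand-in for the
ls -l and open calls mentioned above, not our actual tooling;
preload_metadata() and demo() are hypothetical names.

```python
import os
import stat
import tempfile


def preload_metadata(root, open_files=False):
    """Walk root, lstat() every entry and optionally open regular files.

    Returns (stat_count, open_count) so a periodic job can log its work.
    """
    stat_count = 0
    open_count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # per-entry attributes, as "ls -l" needs
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            stat_count += 1
            if open_files and stat.S_ISREG(st.st_mode):
                try:
                    fd = os.open(path, os.O_RDONLY)
                except OSError:
                    continue
                os.close(fd)
                open_count += 1
    return stat_count, open_count


def demo():
    """Run the preload over a tiny throwaway tree."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "lib"))
        for rel in ("a.py", os.path.join("lib", "b.py")):
            open(os.path.join(root, rel), "w").close()
        return preload_metadata(root, open_files=True)
```

Something like this would be run from cron against the software share;
note that it only refreshes positive entries and does nothing for the
negative lookups themselves.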

I've also been looking at using an overlay and hand crafting whiteout
files in the upper layers to essentially block known negative lookups
from hitting the lower NFS share - again, only useful and correct for
read-only software shares.
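
For anyone curious, a minimal Python sketch of the whiteout idea
follows. It relies on overlayfs representing a whiteout as a character
device with device number 0/0; everything else (plan_whiteouts(),
create_whiteouts(), the example paths) is hypothetical. Creating the
device nodes needs root (CAP_MKNOD), and I am assuming the upper layer
is populated before the overlay is mounted.

```python
import os
import stat


def plan_whiteouts(upper_root, negative_relpaths):
    """Map each known-missing relative path to its whiteout location."""
    planned = []
    for rel in negative_relpaths:
        rel = os.path.normpath(rel)
        if os.path.isabs(rel) or rel == ".." or rel.startswith(".." + os.sep):
            raise ValueError(f"path escapes the upper layer: {rel!r}")
        planned.append(os.path.join(upper_root, rel))
    return planned


def create_whiteouts(paths):
    """Create 0/0 character devices at the planned paths (requires root)."""
    for path in paths:
        # Parent directories must exist in the upper layer so that they
        # merge with the matching directories on the lower NFS share.
        os.makedirs(os.path.dirname(path), exist_ok=True)
        if not os.path.lexists(path):
            os.mknod(path, stat.S_IFCHR, os.makedev(0, 0))
```

Usage would be along the lines of
create_whiteouts(plan_whiteouts("/upper", ["lib/python/foo.py"])),
with the list of names taken from a trace of known negative lookups.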

I wonder if Jeff Layton's directory delegations will help for
(read-only) metadata-heavy lookups over the WAN?

Daire

On Fri, 5 Apr 2024 at 16:03, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
>
> On Fri, 2024-04-05 at 15:47 +0100, Daire Byrne wrote:
> > Apologies for dragging up an old thread, but I've had to tackle
> > wayward negative lookup storms again and I have obviously half
> > forgotten what I learned in this thread last time (even after
> > re-reading it!).
> >
> > Can I just ask if I understand correctly that there was an
> > intention a long time ago to be able to serve negative dentries
> > from a "complete" READDIRPLUS result?
> >
> > https://www.cs.helsinki.fi/linux/linux-kernel/2002-30/0108.html
> >
> > So if we did a readdirplus on a directory then immediately fired
> > random non-existent lookups at the directory, they could be served
> > from the readdirplus result? i.e. not in readdir result, then
> > return ENOENT without needing to ask server?
> >
> > But that is not the case today because you can't track the
> > "completeness" of a READDIRPLUS result for a directory over time
> > (in page cache)? Or is it all due to needing to deal with case
> > insensitive filesystems (which I would think affects positive
> > lookups too)?
> >
> > I did try to decipher the v6.6 fs/nfs/dir.c READDIR bits but I
> > quickly got lost...
> >
> > Cheers,
> >
> > Daire
>
> If the question is whether the client trusts that a READDIR call to the
> server returns all the names that can be successfully looked up, then
> the answer is "no".
> It's not even a question of case sensitivity. There are plenty of
> servers out there that will allow you to look up names that won't ever
> appear in the results of a READDIR (or READDIRPLUS) call. Having a
> hidden ".snapshot" directory is, for instance, a popular way to present
> snapshots.
>
> So no, we're not ever going to implement any negative dentry cache
> scheme that relies on READDIR/READDIRPLUS.
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@xxxxxxxxxxxxxxx