On Fri, 12 Apr 2024 at 11:21, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>
> On Fri, 2024-04-12 at 10:11 +0100, Daire Byrne wrote:
> > Thanks for the clarity Trond - I promise not to forget this time and
> > ask the same question again in 2 years!
> >
> > It just keeps coming up here at DNEG due to accessing software over
> > NFS and crazy PYTHONPATH usage by some of our developers. In some
> > cases, there are 57,000 negative lookups but only 5000 positive
> > lookups (and opens)!
> >
> > Getting devs to optimise their code is my cross to bear I guess.
> >
> > But this is also a well known and common problem for large batch farms
> > and there are some novel workarounds out there:
> >
> > https://guix.gnu.org/en/blog/2021/taming-the-stat-storm-with-a-loader-cache
> > https://computing.llnl.gov/projects/spindle
> > https://cernvm.cern.ch/fs/
> >
> > Coupled with our propensity for high latency (~100ms) NFS via
> > re-export servers (for "cloud rendering"), these inefficient path
> > lookups quickly become a killer - the application takes longer to
> > lookup non-existent files and open files, than it does to execute to
> > completion. We use aggressive caching (actimeo=3600,nocto,vers=3) and
> > "preload" metadata ops (ls -l, open) on a regular basis to try and
> > keep things in (re-export) client cache which certainly helps. It's
> > hard to keep known (expensive) metadata worksets in memory.
> >
> > I've also been looking at using an overlay and hand crafting whiteout
> > files in the upper layers to essentially block known negative lookups
> > from hitting the lower NFS share - again, only useful and correct for
> > read-only software shares.
> >
> > I wonder if Jeff Layton's directory delegations will help for
> > (read-only) metadata heavy lookups over the WAN?
> >
>
> Probably not. In order to optimize away lookups of negative dentries
> that aren't in cache, you need to know all of the positive dentries in
> the directory.
> As Trond pointed out earlier in the discussion, NFS
> doesn't have a concept of directory "completeness", so we can't
> reasonably do this.
>
> FWIW, CephFS does have such a concept and can satisfy readdir requests
> and negative lookups out of the cache when it has complete directory
> info.

Out of interest, do directory delegations help with positive lookups or
repeat opens? They may be less numerous in our badly behaved workloads,
but they are still nice to optimise for latency. Can you disable "cto"
for example if you have a directory delegation and repeatedly open the
same file for reading without a network hop?

I also noticed that "nocto" can completely stop any subsequent network
hops for opens (with a long actimeo) for NFSv3, but on NFSv4 it only
cuts a single GETATTR before still doing an OPEN DH over the network
each time.

I'm probably wandering off into "disconnected clients" and AFS style
territory now...

Daire

> > On Fri, 5 Apr 2024 at 16:03, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
> > >
> > > On Fri, 2024-04-05 at 15:47 +0100, Daire Byrne wrote:
> > > > Apologies for dragging up an old thread, but I've had to tackle
> > > > wayward negative lookup storms again and I have obviously half
> > > > forgotten what I learned in this thread last time (even after
> > > > re-reading it!).
> > > >
> > > > Can I just ask if I understand correctly and that there was an
> > > > intention a long time ago to be able to serve negative dentries from
> > > > a "complete" READDIRPLUS result?
> > > >
> > > > https://www.cs.helsinki.fi/linux/linux-kernel/2002-30/0108.html
> > > >
> > > > So if we did a readdirplus on a directory then immediately fired
> > > > random non existent lookups at the directory, it could be served from
> > > > the readdirplus result? i.e. not in readdir result, then return
> > > > ENOENT without needing to ask server?
> > > >
> > > > But that is not the case today because you can't track the
> > > > "completeness" of a READDIRPLUS result for a directory over time (in
> > > > page cache)? Or is it all due to needing to deal with case
> > > > insensitive filesystems (which I would think effects positive
> > > > lookups too)?
> > > >
> > > > I did try to decipher the v6.6 fs/nfs/dir.c READDIR bits but I
> > > > quickly got lost...
> > > >
> > > > Cheers,
> > > >
> > > > Daire
> > >
> > > If the question is whether the client trusts that a READDIR call to the
> > > server returns all the names that can be successfully looked up, then
> > > the answer is "no".
> > > It's not even a question of case sensitivity. There are plenty of
> > > servers out there that will allow you to look up names that won't ever
> > > appear in the results of a READDIR (or READDIRPLUS) call. Having a
> > > hidden ".snapshot" directory is, for instance, a popular way to present
> > > snapshots.
> > >
> > > So no, we're not ever going to implement any negative dentry cache
> > > scheme that relies on READDIR/READDIRPLUS.
> > > --
> > > Trond Myklebust
> > > Linux NFS client maintainer, Hammerspace
> > > trond.myklebust@xxxxxxxxxxxxxxx
> > >
> > >
> >
>
> --
> Jeff Layton <jlayton@xxxxxxxxxx>