On Mon, 2025-02-24 at 14:09 +1100, NeilBrown wrote: > On Mon, 24 Feb 2025, Al Viro wrote: > > On Mon, Feb 24, 2025 at 12:34:06PM +1100, NeilBrown wrote: > > > On Sat, 22 Feb 2025, Al Viro wrote: > > > > On Fri, Feb 21, 2025 at 10:36:30AM +1100, NeilBrown wrote: > > > > > > > > > +In general, filesystems which use d_instantiate_new() to > > > > > install the new > > > > > +inode can safely return NULL. Filesystems which may not > > > > > have an I_NEW inode > > > > > +should use d_drop();d_splice_alias() and return the result > > > > > of the latter. > > > > > > > > IMO that's a bad pattern, _especially_ if you want to go for > > > > "in-update" > > > > kind of stuff later. > > > > > > Agreed. I have a draft patch to change d_splice_alias() and > > > d_exact_alias() to work on hashed dentrys. I thought it should > > > go after > > > these mkdir patches rather than before. > > > > Could you give a braindump on the things d_exact_alias() is needed > > for? > > It's a recurring headache when doing ->d_name/->d_parent audits; > > see e.g. > > https://lore.kernel.org/all/20241213080023.GI3387508@ZenIV/ for > > related > > mini-rant from the latest iteration. > > > > Proof of correctness is bloody awful; it feels like the primitive > > itself > > is wrong, but I'd never been able to write anything concise > > regarding > > the things we really want there ;-/ > > > > As I understand it, it is needed (or wanted) to handle the > possibility > of an inode becoming "stale" and then recovering. This could happen, > for example, with a temporarily misconfigured NFS server. > > If ->d_revalidate gets a NFSERR_STALE from the server it will return > '0' > so lookup_fast() and others will call d_invalidate() which will > d_drop() > the dentry. There are other paths on which -ESTALE can result in > d_drop(). > > If a subsequent attempt to "open" the name successfully finds the > same > inode we want to reuse the old dentry rather than create a new one. > > I don't really understand why. This code was added 20 years ago > before > git. > It was introduced by > > commit 89a45174b6b32596ea98fa3f89a243e2c1188a01 > Author: Trond Myklebust <trond.myklebust@xxxxxxxxxx> > Date: Tue Jan 4 21:41:37 2005 +0100 > > VFS: Avoid dentry aliasing problems in filesystems like NFS, > where > inodes may be marked as stale in one instance (causing the > dentry > to be dropped) then re-enabled in the next instance. > > Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxx> > > in history.git > > Trond: do you have any memory of this? Can you explain what the > symptom > was that you wanted to fix? > > The original patch used d_add_unique() for lookup and atomic_open and > readdir prime-dcache. We now only use it for v4 atomic_open. Maybe > we > don't need it at all? Or maybe we need to restore it to those other > callers? > 2005? That looks like it was trying to deal with the userspace NFS server. I can't remember when it was given the ability to use the inode generation counter, but I'm pretty sure that in 2005 there were plenty of setups out there that had the older version that reused filehandles (due to inode number reuse). So you would get spurious ESTALE errors sometimes due to inode number reuse, sometimes because the filehandle fell out of the userspace NFS server's cache. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx