On Tue, Apr 02, 2019 at 05:48:24PM +0100, Al Viro wrote: > ->d_prune() must not drop/regain any of the locks held by caller. > It must _not_ free anything attached to dentry - that belongs > later in the shutdown sequence. If anything, I'm tempted to > make it take const struct dentry * as argument, just to make > that clear. > > No new (counted) references can be acquired by that point; > lockless dcache lookup might find our dentry a match, but > result of such lookup is not going to be legitimized - it's > doomed to be thrown out as stale. > > It really makes more sense as part of struct dentry lifecycle > description... > > [1] in theory, ->d_time might be changed by overlapping lockless > call of ->d_revalidate(). Up to filesystem - VFS doesn't touch > that field (and AFAICS only NFS uses it these days). One addition: ->d_prune() can overlap only with * lockless ->d_hash()/->d_compare() * lockless ->d_revalidate() * lockless ->d_manage() So it must not destroy anything used by those without an RCU delay. The same goes for ->d_release() - both the list of the things it can overlap with and requirements re RCU delays. In-tree ->d_prune() instances are fine and so's the majority of ->d_release(). However, autofs ->d_release() has something that looks like an RCU use-after-free waiting to happen: static void autofs_dentry_release(struct dentry *de) { struct autofs_info *ino = autofs_dentry_ino(de); struct autofs_sb_info *sbi = autofs_sbi(de->d_sb); pr_debug("releasing %p\n", de); if (!ino) return; ... autofs_free_ino(ino); } with autofs_free_ino() being straight kfree(). Which means that the lockless case of autofs_d_manage() can run into autofs_dentry_ino(dentry) getting freed right under it. And there we do have this reachable: int autofs_expire_wait(const struct path *path, int rcu_walk) { struct dentry *dentry = path->dentry; struct autofs_sb_info *sbi = autofs_sbi(dentry->d_sb); struct autofs_info *ino = autofs_dentry_ino(dentry); int status; int state; /* Block on any pending expire */ if (!(ino->flags & AUTOFS_INF_WANT_EXPIRE)) return 0; if (rcu_walk) return -ECHILD; the second check buggers off in lockless mode; the first one can be done in lockless mode just fine, so AFAICS we do have a problem there. Smells like we ought to make that kfree in autofs_free_ino() RCU-delayed... Ian, have you, by any chance, run into reports like that? Use-after-free or oopsen in autofs_expire_wait() and friends, that is... AFAICS, everything else is safe; however, looking through those has turned up a fishy spot in ceph_d_revalidate(): } else if (d_really_is_positive(dentry) && ceph_snap(d_inode(dentry)) == CEPH_SNAPDIR) { valid = 1; Again, lockless ->d_revalidate() is called without anything that would hold ->d_inode stable; the first part of the condition does not guarantee that we won't run into ceph_snap(NULL) and oops. Sure, compiler is almost certainly not going to reload here, but we ought to use d_inode_rcu(dentry) there. Sigh...