Recently I went looking through the ->d_flags locking (long story, that was a side branch in digging through the issues with Neil's locking scheme); results are interesting.

All stores to dentry->d_flags are done by somebody who
 * holds a counting reference to the dentry, or
 * has found a counting reference in shared data structures, in conditions that guarantee that this reference won't go away, or
 * has found the dentry in the alias list of some inode (under ->i_lock), or
 * [fs/dcache.c only] is an LRU walker callback running into that dentry in the LRU list, or
 * [fs/dcache.c only] is the owner of a shrink list running into the dentry in that list, or
 * [fs/dcache.c only] is a d_walk() callback running into that dentry with zero refcount (and moving it to a shrink list), or
 * [fs/dcache.c only] is a d_walk() called by kill_litter_super() running into that dentry when unpinning theretofore persistent dentries; that can happen only during filesystem shutdown, when nobody else could be accessing it.

The above guarantees that the dentry won't disappear under us; another interesting thing is exclusion, and that's where things get nasty.

Most of the stores to dentry->d_flags are done under dentry->d_lock. There are obvious exceptions on the allocation side (stores done to a dentry that is not visible to anybody else yet), but aside from those there are two exceptions. One is d_set_d_op(), the other is setting DCACHE_PAR_LOOKUP in d_alloc_parallel().

The former sets ->d_op and marks the presence of several methods (->d_hash(), ->d_compare(), ->d_revalidate(), ->d_weak_revalidate(), ->d_delete(), ->d_prune() and ->d_real()) in ->d_flags. It can't be done more than once and, if the filesystem has ->s_d_op set in its superblock, it is done by the constructor (__d_alloc()). That, obviously, falls under "on the allocation side"; so does another common case - d_alloc_pseudo() setting ->d_op to &anon_ops if __d_alloc() hadn't set it to ->s_d_op.

Note that there are no barriers between the stores to ->d_op and ->d_flags in d_set_d_op(); for allocation-time uses that's not a problem - fetches on another CPU would have to be preceded by finding the dentry in the first place, and the barriers on insertion into wherever it had been found would suffice.

There are other callers of d_set_d_op(), though - one in simple_lookup() (again, if ->s_d_op is NULL) and the rest are all in procfs. Those are done with no locking whatsoever and the dentry is *not* entirely invisible. It's mostly invisible, though - in all those cases the dentry is
 * negative, and thus unreachable via the alias list of any inode,
 * unhashed, and thus can't be found via dcache lookup, and
 * has its only direct reference held by the caller, who has been holding it since it got allocated (i.e. it couldn't have been put onto LRU or shrink lists either).
That is enough to guarantee that nobody else will be doing stores to ->d_flags, so our store is safe even without ->d_lock. It is also fucking ugly and brittle...

Moreover, the question of ->d_op vs. ->d_flags ordering also needs to be dealt with - unlike the calls at allocation time, insertion into wherever it had been found does *not* order fetches past both stores - not if it had been inserted into that wherever before the call of d_set_d_op(). Thankfully, the set of methods present in dentry_operations ever fed to those late calls of d_set_d_op() is limited: ->d_revalidate, ->d_delete and, in one case, ->d_compare.
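For reference, the relevant part of d_set_d_op() looks roughly like this (paraphrased from fs/dcache.c, WARN_ONs trimmed; the exact flag list may differ between versions, so treat it as a sketch rather than the current source) - note that the store to ->d_op and the stores to ->d_flags are all plain, with nothing ordering them:

	void d_set_d_op(struct dentry *dentry, const struct dentry_operations *op)
	{
		WARN_ON_ONCE(dentry->d_op);	/* can't be done more than once */
		dentry->d_op = op;		/* plain store */
		if (!op)
			return;
		/* plain stores, no barrier between these and ->d_op above */
		if (op->d_hash)
			dentry->d_flags |= DCACHE_OP_HASH;
		if (op->d_compare)
			dentry->d_flags |= DCACHE_OP_COMPARE;
		if (op->d_revalidate)
			dentry->d_flags |= DCACHE_OP_REVALIDATE;
		if (op->d_weak_revalidate)
			dentry->d_flags |= DCACHE_OP_WEAK_REVALIDATE;
		if (op->d_delete)
			dentry->d_flags |= DCACHE_OP_DELETE;
		if (op->d_prune)
			dentry->d_flags |= DCACHE_OP_PRUNE;
		if (op->d_real)
			dentry->d_flags |= DCACHE_OP_REAL;
	}

So an observer that reaches the dentry without going through whatever insertion provided the ordering could see, say, DCACHE_OP_REVALIDATE already set in ->d_flags while still reading a stale NULL ->d_op - which is why the late callers have to rely on the barriers enumerated below.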
->d_delete() is easy - it's called only from one place (retain_dentry()) and there we have just dropped the last reference to the dentry in question. Since the caller of d_set_d_op() is holding a reference, the usual barriers on ->d_lockref use are enough.

->d_compare() is a bit confusing - the dentry it's getting as an argument is not the one whose method it is. We have the parent (already observed to be positive) and we are checking whether this child (with ->d_parent pointing to the parent) matches the name we want to look up. We take the parent's ->d_compare() and give it the child dentry, a snapshot of the child's name and the name we want to match it against. We *CAN* get a dentry in the middle of d_set_d_op() passed to someone's ->d_compare() - d_alloc_parallel() does that to check if there's an in-lookup dentry matching the name we want. But ->d_compare() comes from the parent, and that had already been observed to be positive. Which means that the barriers in __d_set_inode_and_type() (from d_splice_alias()) suffice.

->d_revalidate() is only called for dentries that had been found in dcache hash chains at some point, so there the barrier on insertion into the hash (in __d_add() from d_splice_alias()) is enough.

That covers the d_set_d_op() callers; the other exception is d_alloc_parallel() when it decides to insert a new dentry into the in-lookup hash and marks it with DCACHE_PAR_LOOKUP. Also safe, since the dentry is only visible in the parent's list of children, has a positive refcount and had it all along, so nobody else would try to store to ->d_flags at the same time.

In case it's not obvious from the above, I'm less than happy with the entire thing - it may be provably correct, but it's much too brittle. If nothing else, d_set_d_op() should be unexported. Do it to a hashed or, worse, a positive dentry and you are asking for serious trouble. Leaving it as a public API is a really bad idea. Something along the lines of d_splice_alias_ops(inode, dentry, ops) (not exported, until we get a convincing modular user) is worth doing; all procfs callers of d_set_d_op() follow it with d_splice_alias() pretty much immediately. And yes, that could be done under ->d_lock, eliminating that special case from the proof (rough sketch at the end of this mail).

As for the allocation-time uses... We could bloody well calculate the ->d_flags bits to go along with ->s_d_op and just use that; it's not just about getting rid of recalculating them for each dentry ever allocated on the filesystem in question, we could get rid of quite a few always_delete_dentry users while we are at it. Look: ->d_delete == always_delete_dentry (and DCACHE_OP_DELETE to go with it) is equivalent to DCACHE_DONTCACHE; the only place where we look at either is retain_dentry(), where we have this:

	// ->d_delete() might tell us not to bother, but that requires
	// ->d_lock; can't decide without it
	if (unlikely(d_flags & DCACHE_OP_DELETE)) {
		if (!locked || dentry->d_op->d_delete(dentry))
			return false;
	}

	// Explicitly told not to bother
	if (unlikely(d_flags & DCACHE_DONTCACHE))
		return false;

The inner if turns into

	if (!locked || 1)
		return false;

so for those DCACHE_DONTCACHE would be equivalent. And it could be put into the same "set those bits in all ->d_flags on that fs"; what's more, simple_lookup() doesn't need to set ->d_op at all - it can just set DCACHE_DONTCACHE in the unlikely case it's not been already set.

How about something along the lines of

0) add d_splice_alias_ops(inode, dentry, dops), have procfs switch to that.

1) provide set_default_d_op(superblock, dops), use it in place of assignments to ->s_d_op.
Rename ->s_d_op to catch unconverted filesystems. Tree-wide, entirely mechanical.

2) split the calculation of d_flags bits into a separate helper, add ->s_d_flags, have set_default_d_op() calculate and set that, have __d_alloc() pick ->s_d_op and ->s_d_flags directly (rough sketch at the end of this mail).

3) convert those who wish to move from use of always_delete_dentry to adding DCACHE_DONTCACHE into ->s_d_flags. For devpts, for example, that avoids the need for a non-NULL ->d_op.

4) replace d_set_d_op() in simple_lookup() with

	if (unlikely(!(dentry->d_flags & DCACHE_DONTCACHE))) {
		spin_lock(&dentry->d_lock);
		dentry->d_flags |= DCACHE_DONTCACHE;
		spin_unlock(&dentry->d_lock);
	}

5) kill simple_dentry_operations - no users would be left.

6) make d_set_d_op() static in fs/dcache.c.

I'll put together something along those lines and post it later today.

With that done, the last remaining store to ->d_flags without ->d_lock would be in d_alloc_parallel(); it can be killed off (we could set it in new->d_flags before anyone sees that dentry). The only reason I'd left that to the actual insertion into the in-lookup hash chain is that DCACHE_PAR_LOOKUP currently corresponds to "it's in the in-lookup hash, what would've been ->d_alias is occupied by that hash chain", and at some point I think we had final dput() scream bloody murder if it saw such a dentry. Hell knows... d_alloc_parallel() is going to be heavily affected by any locking reworks; what should be done with that wart depends upon the direction we take for that work.

PS: turns out that set_default_d_op() is slightly more interesting than I expected - fuse has separate dentry_operations for its root dentry. I don't see the point, TBH - the only difference is that the root one lacks
 * ->d_delete() (never even called for the root dentry; it's only called if d_unhashed() is false),
 * ->d_revalidate() (never called for the root), and
 * ->d_automount() (not even looked at unless you have S_AUTOMOUNT set on the inode).
What's wrong with using fuse_dentry_operations for the root dentry? Am I missing something subtle here? Miklos?
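The promised sketches. First, the shape 0) could take; only the name d_splice_alias_ops() and doing the ->d_op store under ->d_lock come from the proposal above, the rest is illustration, not the actual patch:

	/* sketch only - not the actual patch */
	struct dentry *d_splice_alias_ops(struct inode *inode, struct dentry *dentry,
					  const struct dentry_operations *ops)
	{
		/*
		 * Caller guarantees the dentry is negative, unhashed and not
		 * yet visible to anybody else (the same conditions the procfs
		 * callers rely upon today); taking ->d_lock here removes the
		 * "lockless store to ->d_flags" special case from the proof.
		 */
		spin_lock(&dentry->d_lock);
		d_set_d_op(dentry, ops);
		spin_unlock(&dentry->d_lock);
		return d_splice_alias(inode, dentry);
	}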
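And a sketch of 1)+2). Apart from the names already mentioned above (set_default_d_op(), ->s_d_op, ->s_d_flags), everything here - in particular the d_op_flags() helper - is made up for illustration; the bit calculation is the same one d_set_d_op() does now:

	/* sketch only: the DCACHE_OP_* bits implied by a dentry_operations */
	static unsigned int d_op_flags(const struct dentry_operations *op)
	{
		unsigned int flags = 0;

		if (!op)
			return 0;
		if (op->d_hash)
			flags |= DCACHE_OP_HASH;
		if (op->d_compare)
			flags |= DCACHE_OP_COMPARE;
		if (op->d_revalidate)
			flags |= DCACHE_OP_REVALIDATE;
		if (op->d_weak_revalidate)
			flags |= DCACHE_OP_WEAK_REVALIDATE;
		if (op->d_delete)
			flags |= DCACHE_OP_DELETE;
		if (op->d_prune)
			flags |= DCACHE_OP_PRUNE;
		if (op->d_real)
			flags |= DCACHE_OP_REAL;
		return flags;
	}

	/* called once at fill_super/mount time instead of "sb->s_d_op = dops;" */
	void set_default_d_op(struct super_block *sb, const struct dentry_operations *dops)
	{
		sb->s_d_op = dops;
		sb->s_d_flags = d_op_flags(dops);	/* new field from 2) above */
	}

__d_alloc() would then copy ->s_d_op and ->s_d_flags into the new dentry before it becomes visible, and the filesystems from 3) could simply add DCACHE_DONTCACHE to ->s_d_flags without having a ->d_op at all.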