On Sat, 27 Aug 2022, Al Viro wrote:
> On Fri, Aug 26, 2022 at 12:06:55PM -0700, Linus Torvalds wrote:
>
> > Because right now I think the main reason we cannot move the lock into
> > the filesystem is literally that we've made the lock cover not just
> > the filesystem part, but the "lookup and create dentry" part too.
>
> How about rename loop prevention? mount vs. rmdir? d_splice_alias()
> fun on tree reconnects?

Thanks for this list.

I think the "mount vs. rmdir" usage of inode_lock() is independent of
the usage for directory operations, so we can change the latter as much
as we like without materially affecting the former.
The lock we take on the directory being removed DOES ensure no new
objects are linked into the directory, so for that reason we still need
at least a shared lock when adding links to a directory.
Moving that lock into the filesystem would invert the locking order in
rmdir between the child being removed and the parent being locked. That
would require some consideration.

d_splice_alias() happens at ->lookup time so it is already under a
shared lock. I don't see that it depends on i_rwsem - it uses i_lock
for the important locking.

Rename loop prevention is largely managed by s_vfs_rename_mutex. Once
that is taken, nothing can be moved to a different directory. That
means 'trap' will keep any relationship it had to new_path and old_path.
It could be renamed within its parent, but as long as it isn't removed
the comparisons with old_dentry and new_dentry should still be reliable.
As 'trap' clearly isn't empty, we trust that the filesystem won't allow
an rmdir to succeed.

What have I missed?

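To make the 'trap' comparison concrete, here is a rough user-space
sketch of the ancestor check. It is a toy model, not kernel code:
struct toy_dentry and the toy_*() helpers are invented for illustration,
all locking (including s_vfs_rename_mutex) is left out, and the error
codes are an assumption about how the rename path treats 'trap'.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for a dentry (hypothetical): only the parent link
 * matters here.  The root is its own parent. */
struct toy_dentry {
	struct toy_dentry *parent;
	const char *name;
};

/* If p1 is an ancestor of p2, return the child of p1 that lies on the
 * path down to p2; otherwise return NULL. */
static struct toy_dentry *toy_ancestor_child(struct toy_dentry *p1,
					     struct toy_dentry *p2)
{
	struct toy_dentry *p;

	assert(p1 && p2);
	for (p = p2; p->parent != p; p = p->parent)
		if (p->parent == p1)
			return p;
	return NULL;
}

/* Work out 'trap' for the two parent directories of a rename.  In this
 * model the tree is assumed frozen while we look, which is the job
 * s_vfs_rename_mutex is described as doing above. */
static struct toy_dentry *toy_rename_trap(struct toy_dentry *dir1,
					  struct toy_dentry *dir2)
{
	struct toy_dentry *trap;

	if (dir1 == dir2)
		return NULL;	/* same directory: no loop is possible */
	trap = toy_ancestor_child(dir2, dir1);
	if (trap)
		return trap;
	return toy_ancestor_child(dir1, dir2);
}

/* Compare the source and target against 'trap'. */
static int toy_rename_check(struct toy_dentry *old_dentry,
			    struct toy_dentry *new_dentry,
			    struct toy_dentry *trap)
{
	if (!trap)
		return 0;
	if (old_dentry == trap)
		return -EINVAL;		/* source is an ancestor of target */
	if (new_dentry == trap)
		return -ENOTEMPTY;	/* target is an ancestor of source */
	return 0;
}
```

With a tree /a/b/c, moving a/b to a/b/c/x gives trap == b, which is the
source, so the check refuses the rename. The comparison only stays
meaningful if 'trap' cannot be moved to another directory or removed
while the check runs.
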
Thanks,
NeilBrown

>
> > But once you have that "DCACHE_PAR_LOOKUP" bit and the
> > d_alloc_parallel() logic to serialize a _particular_ dentry being
> > created (as opposed to serializing all the sleeping ops to that
> > directly), I really think we should strive to move the locking - that
> > no longer helps the VFS dcache layer - closer to the filesystem call
> > and eventually into it.
>
> See above.
>