On Wed, 22 Jan 2025, Dave Chinner wrote:
> 
> Anyone who has been following io_uring development should know all
> these things about async processing already. There's a reason that
> that infrastructure exists: async processing is more efficient and
> faster than the concurrent synchronous processing model being
> proposed here....

I understand that asynchronous is best.  I think we are a long way from
achieving that.  I think shared locking is still a good step in that
direction.

Shared locking allows the exclusion to be pushed down into the
filesystem to whatever extent the filesystem needs.  That will be needed
for an async approach too.

We already have a hint of async in the dcache in that ->lookup() can
complete without a result if an intent flag is set.  The actual lookup
might then happen any time before the intended operation completes.  For
NFS exclusive open, that lookup is combined with the create/open.  For
unlink (which doesn't have an intent flag yet) it could be combined with
the NFS REMOVE operation (if that seemed like a good idea).  Other
filesystems could do other things.  But this is just a hint of async as
yet.

I imagine that in the longer term we could drop the i_rwsem completely
for directories.  The VFS would set up a locked dentry much like it does
before ->lookup and then call into the filesystem.  The filesystem might
do the op synchronously or might take note of what is needed and
schedule the relevant changes or whatever.  When the op finishes it does
clear_and_wake_up_bit() (or similar) after stashing the result
... somewhere.

For synchronous operations like syscalls, an on-stack result struct
would be passed which contains an error status and optionally a new
dentry (if e.g. mkdir found it needed to splice in an existing dentry).
For async operations io_uring would allocate the result struct and would
store in it a callback function to be called after the
clear_and_wake_up_bit().

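
To make the result-struct idea a little more concrete, here is a rough
userspace sketch.  Every name in it (struct dirop_result,
dirop_complete(), sync_mkdir_model(), the dentry_model type) is made up
for illustration and none of it is existing kernel API.  The per-dentry
lock bit is modelled as a plain bool, and there is no real waiting or
waking; in the kernel the unlock step would be clear_and_wake_up_bit()
or similar.

```c
/* Userspace model only: all names here are hypothetical, not kernel API. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a dentry carrying a per-dentry "locked" bit. */
struct dentry_model {
	bool locked;
	const char *name;
};

/* Result of one directory operation, as described above. */
struct dirop_result {
	int error;                            /* 0 or a negative errno, e.g. -EEXIST */
	struct dentry_model *new_dentry;      /* optional: dentry spliced in by the fs */
	void (*done)(struct dirop_result *);  /* NULL for synchronous callers */
	void *cb_data;                        /* opaque data for the callback */
};

/*
 * Called by the "filesystem" when the op finishes: stash the result,
 * drop the lock bit, and only then run the callback, if there is one.
 */
static void dirop_complete(struct dentry_model *d, struct dirop_result *res,
			   int error, struct dentry_model *spliced)
{
	assert(d->locked);
	res->error = error;
	res->new_dentry = spliced;
	d->locked = false;	/* kernel: clear_and_wake_up_bit() or similar */
	if (res->done)
		res->done(res);
}

/* Synchronous caller: the result struct lives on the stack. */
static int sync_mkdir_model(struct dentry_model *d, int fs_error)
{
	struct dirop_result res = { 0 };

	d->locked = true;
	/* The model "filesystem" completes immediately. */
	dirop_complete(d, &res, fs_error, NULL);
	/* A real caller would wait for the lock bit to clear here. */
	return res.error;
}

/* Example async callback: count completions through cb_data. */
static void count_completion(struct dirop_result *res)
{
	int *counter = res->cb_data;

	(*counter)++;
}
```

An async submitter would allocate the struct dirop_result itself, set
->done (count_completion() above is a trivial example), and return
without waiting; the callback then runs only after the dentry has been
unlocked.
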
Rather than using i_rwsem to block additions to a directory while it is
being removed, we would lock the dentry (so no more locked children can
be added) and wait for any locked children to be unlocked.

There are doubtless details that I have missed but it is clear that to
allow async dirops we need to remove the need for i_rwsem, and I think
transitioning from exclusive to shared is a useful step in that
direction.

I'm almost tempted to add the result struct to the new _shared
inode_operations that I want to add, but that would likely be premature.

Thanks,
NeilBrown