On Tue, Jan 21, 2025 at 7:07 AM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>
> On Tue, 2025-01-21 at 12:20 +1100, Dave Chinner wrote:
> > On Mon, Jan 20, 2025 at 09:25:37AM +1100, NeilBrown wrote:
> > > On Mon, 20 Jan 2025, Dave Chinner wrote:
> > > > On Sat, Jan 18, 2025 at 12:06:30PM +1100, NeilBrown wrote:
> > > > >
> > > > > My question to fs-devel is: is anyone willing to convert their fs (or
> > > > > advise me on converting?) to use the new interface and do some testing
> > > > > and be open to exploring any bugs that appear?
> > > >
> > > > tl;dr: You're asking for people to put in a *lot* of time to convert
> > > > complex filesystems to concurrent directory modifications without
> > > > clear indication that it will improve performance. Hence I wouldn't
> > > > expect widespread enthusiasm to suddenly implement it...
> > >
> > > Thanks Dave!
> > > Your point as detailed below seems to be that, for xfs at least, it may
> > > be better to reduce hold times for exclusive locks rather than allow
> > > concurrent locks. That seems entirely credible for a local fs but
> > > doesn't apply for NFS as we cannot get a success status before the
> > > operation is complete.
> >
> > How is that different from a local filesystem? A local filesystem
> > can't return from open(O_CREAT) with a struct file referencing a
> > newly allocated inode until the VFS inode is fully instantiated (or
> > failed), either...
> >
> > i.e. this sounds like you want concurrent share-locked dirent ops so
> > that synchronously processed operations can be issued concurrently.
> >
> > Could the NFS client implement asynchronous directory ops, keeping
> > track of the operations in flight without needing to hold the parent
> > i_rwsem across each individual operation? This is basically what I've
> > been describing for XFS to minimise parent dir lock hold times.
> >
>
> Yes, basically. The protocol and NFS client have no requirement to
> serialize directory operations. We'd be happy to spray as many at the
> server in parallel as we can get away with. We currently don't do that
> today, largely because the VFS prohibits it.
>
> The NFS server, or exported filesystem may have requirements that
> serialize these operations though.
>
> > What would VFS support for that look like? Is that of similar
> > complexity to implementing shared locking support so that concurrent
> > blocking directory operations can be issued? Is async processing a
> > better model to move the directory ops towards so we can tie
> > userspace directly into it via io_uring?
> >
>
> Given that the VFS requires an exclusive lock today for directory
> morphing ops, moving to a model where we can take a shared lock on the
> directory instead seems like a nice, incremental approach to dealing
> with this problem.
>
> That said, I get your objection. Not being able to upgrade a rwsem
> makes that shared lock kind of nasty for filesystems that actually do
> rely on it for some parts of their work today.
>
> The usual method of dealing with that would be to create a new XFS-only
> per-inode lock that would take over that serialization. The nice thing
> there is that you could (over time) reduce its scope.
>
> > > So it seems likely that different filesystems
> > > will want different approaches. No surprise.
> > >
> > > There is some evidence that ext4 can be converted to concurrent
> > > modification without a lot of work, and with measurable benefits. I
> > > guess I should focus there for local filesystems.
> > >
> > > But I don't want to assume what is best for each file system which is
> > > why I asked for input from developers of the various filesystems.

This is an interesting question for SMB3.1.1 as well (cifs.ko etc.),
especially in workloads where the client already holds a directory lease
on the directory being updated (and with multichannel there could be
quite a bit of I/O in flight). Even without a directory lease, though,
there is no restriction on sending multiple simultaneous
directory-related requests to the server.

-- 
Thanks,

Steve