On Thu, Feb 10, 2022 at 06:19:09PM +0000, Daire Byrne wrote:
> I had a quick attempt at updating Neil's patch for mainline but I
> quickly got stuck and confused. It looks like fs/namei.c in particular
> underwent major changes and refactoring from v5.7+.
>
> If there is ever any interest in updating this and getting it into
> mainline, I'm more than willing to test it with production loads. I
> just lack the skills to update it myself.
>
> It definitely solves a big problem for us, but I also suspect we may
> be the only ones with this problem.

It benefits anyone trying to do a lot of creates in a directory on an
NFS filesystem where the network round trip time is significant. That
doesn't seem so weird. And even if the case is a little weird, just
having a case and clear numbers to show the improvement is a big help.

I haven't had the chance to read Neil's patch or work out what the
issue with the namei changes is.

Al Viro is the expert on VFS locking. I was sure I'd seen him speculate
about what would be needed to make parallel directory modifications
possible, but I spent some time mining old mail and didn't find that.

I think the path forward would be to update Neil's patch, add your
performance data, send it to Al and linux-fsdevel, and see if we can
get some idea what remains to be done to get this right.

--b.

>
> Cheers,
>
> Daire
>
>
> On Tue, 8 Feb 2022 at 18:48, Daire Byrne <daire@xxxxxxxx> wrote:
> >
> > On Wed, 26 Jan 2022 at 02:57, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
> > >
> > > On Wed, Jan 26, 2022 at 11:02:16AM +1100, NeilBrown wrote:
> > > > On Wed, 26 Jan 2022, J. Bruce Fields wrote:
> > > > > On Tue, Jan 25, 2022 at 03:15:42PM -0600, Patrick Goetz wrote:
> > > > > > So the directory is locked while the inode is created, or something
> > > > > > like this, which makes sense.
> > > > > >
> > > > > It accomplishes a number of things, details in
> > > > > https://www.kernel.org/doc/html/latest/filesystems/directory-locking.html
> > > > >
> > > > Just in case anyone is interested, I wrote this a while back:
> > > >
> > > > http://lists.lustre.org/pipermail/lustre-devel-lustre.org/2018-November/008177.html
> > > >
> > > > it includes a patch to allow parallel creates/deletes over NFS (and any
> > > > other filesystem which adds support).
> > > > I doubt it still applies, but it wouldn't be hard to make it work if
> > > > anyone was willing to make a strong case that we would benefit from
> > > > this.
> >
> > Well, I couldn't resist quickly testing Neil's patch. I found it
> > applied okay to v5.6.19 with minimal edits.
> >
> > As before, without the patch, parallel file creates in a single
> > directory with 1000 threads topped out at an aggregate of 3 creates/s
> > over a 200ms link. With the patch it jumps up to 1,200 creates/s.
> >
> > So a pretty dramatic difference. I can't say if there are any other
> > side effects or regressions as I only did this simple test.
> >
> > It's great for our super niche workloads and use case anyway.
> >
> > Daire
> >
> >
> > > Neato.
> > >
> > > Removing the need to hold an exclusive lock on the directory across
> > > server round trips seems compelling to me....
> > >
> > > I also wonder: why couldn't you fire off the RPC without any locks, then
> > > wait till you get a reply to take locks and update your local cache?
> > >
> > > OK, for one thing, calls and replies and server processing could all get
> > > reordered. We'd need to know what order the server processed operations
> > > in, so we could process replies in the same order.
> > >
> > > --b.
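
The reply-ordering concern in the oldest quoted message can be illustrated with a small user-space sketch. It is not kernel code and not how the NFS client works. It assumes, purely for illustration, that every reply carries a server-assigned sequence number (`seq`) recording the order in which the server processed the operation; nothing in this thread says the protocol provides one. `OrderedReplyApplier` and `dir_cache` are hypothetical names. Replies that arrive early are held back until every earlier one has been applied:

```python
import heapq


class OrderedReplyApplier:
    """Apply replies in server processing order, whatever order they arrive in.

    Hypothetical sketch: assumes each reply carries a server-assigned
    sequence number, which is an assumption and not a known NFS feature.
    """

    def __init__(self, first_seq=0):
        self.next_seq = first_seq  # next sequence number we are allowed to apply
        self.pending = []          # min-heap of (seq, name) replies that arrived early
        self.dir_cache = []        # stand-in for the locally cached directory contents

    def on_reply(self, seq, name):
        heapq.heappush(self.pending, (seq, name))
        # Drain every buffered reply that is now contiguous with what was applied.
        while self.pending and self.pending[0][0] == self.next_seq:
            _, ready = heapq.heappop(self.pending)
            # A real client would take its locks and update its cache here.
            self.dir_cache.append(ready)
            self.next_seq += 1


# Replies arrive out of order, but are applied in server order.
applier = OrderedReplyApplier()
for seq, name in [(2, "c"), (0, "a"), (1, "b"), (3, "d")]:
    applier.on_reply(seq, name)
print(applier.dir_cache)
```

The cost of this approach is that one delayed reply stalls every later one, so a real design would also need a story for lost or very late replies.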