On Thu, Apr 04, 2019 at 11:09:47AM -0400, Jeff Layton wrote:
> On Wed, Apr 3, 2019 at 9:06 PM bfields@xxxxxxxxxxxx <bfields@xxxxxxxxxxxx> wrote:
> The serialized create with something like an untar is a
> performance-killer though.

Yes. And Trond's proposal only allows hiding the server-to-disk round
trip time, not the client-to-server round trip time. On the other hand,
it seems a lot easier than write delegations.

> FWIW, I'm working on something similar right now for Ceph. If a ceph
> client has adequate caps [1] for a directory and the dentry inode,
> then we should (in principle) be able to buffer up directory morphing
> operations and flush them out to the server asynchronously.
>
> I'm starting with unlink (mostly because it's simpler), and am mainly
> just returning early when we do have the right caps -- after issuing
> the call but before the reply comes in. We should be able to do the
> same for link, rename and create too. Create will require the Ceph MDS
> to delegate out a range of inode numbers (and that bit hasn't been
> implemented yet).

Is there some reason it's impossible for the client to return from
create before it has an inode number?

> My thinking with all of this is that the buffering of directory
> morphing operations is not as helpful as something like a pagecache
> write is, as we aren't that interested in merging operations that
> change the same dentry. However, being able to do them asynchronously
> should work really well. That should allow us to better parallellize
> create/link/unlink/rename on different dentries even when they are
> issued serially by a single task.
>
> RFC5661 doesn't currently provide for writeable directory delegations,
> AFAICT, but they could eventually be implemented in a similar way.

People also worried about delegating create in the face of differing
rules about case insensitivity and about which characters are legal in
filenames. But I really think there should be some way to manage that.

--b.