Cephfs, like most network filesystems, sucks badly at metadata-heavy
workloads. The clients (kcephfs and libcephfs) always do synchronous
calls to the MDS for directory-morphing operations (create, unlink,
link and rename), and those RTT delays add up.

In principle, cephfs is different in that if we have appropriate caps,
we ought to be able to buffer up directory-morphing operations and
eventually flush them out to the MDS prior to releasing those caps.

While the biggest win from this approach is probably going to be in the
create codepath, starting with unlink is a lot simpler. The idea here is
that if we hold refs on the appropriate caps (Fx on the directory and Lx
on the inode being unlinked), then we should be able to return from the
syscall immediately after transmitting the unlink request, under the
assumption that it will succeed. If the unlink does fail, then we'd
report an error when the caller does an fsync on the parent directory.

The series starts with some reorganization that allows the client to do
async MDS requests, and then the last several patches add the ability to
do an asynchronous unlink.

For now, this is just an RFC series. I think we could probably take the
first 7 or so patches for the next merge window, but the async unlink
patches themselves should probably wait until the MDS better supports
this.

I did do a little performance testing with this, but it doesn't seem to
improve things much if at all. Still, this is a good place to start with
async MDS ops, and we may be able to improve things later.

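
To illustrate what the error-reporting change would mean for userspace,
here is a minimal sketch of a caller that wants a definitive result for
an unlink. It is not part of the series. The helper name
unlink_then_sync_dir() is made up for this example, and it assumes a
Linux system where fsync() on a read-only directory fd is permitted.
With a synchronous unlink the error comes back from unlinkat() itself;
with the async unlink described above, a failure would only show up at
the fsync() on the parent directory.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the series): unlink "name" inside the
 * directory "dirpath", then fsync the directory so that any deferred
 * error from the unlink is reported before we return.
 *
 * Returns 0 on success or a negative errno value on failure.
 */
int unlink_then_sync_dir(const char *dirpath, const char *name)
{
	int dirfd, ret = 0;

	dirfd = open(dirpath, O_RDONLY | O_DIRECTORY);
	if (dirfd < 0)
		return -errno;

	/* A synchronous unlink reports its failure right here. */
	if (unlinkat(dirfd, name, 0) < 0)
		ret = -errno;
	/* An async unlink would report its failure here instead. */
	else if (fsync(dirfd) < 0)
		ret = -errno;

	close(dirfd);
	return ret;
}
```

A caller that doesn't care whether the unlink really succeeded can skip
the fsync and keep the latency win; one that does care pays for a single
round trip per directory rather than one per file.
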
Jeff Layton (11):
  ceph: after an MDS request, do callback and completions
  ceph: have ceph_mdsc_do_request call ceph_mdsc_submit_request
  ceph: move wait for mds request into helper function
  ceph: hold extra reference to r_parent over life of request
  ceph: fix comment over ceph_drop_caps_for_unlink
  ceph: simplify arguments and return semantics of try_get_cap_refs
  ceph: register MDS request with dir inode from the get-go
  ceph: add refcounting for Fx caps
  ceph: add refcounting for Lx caps
  ceph: perform asynchronous unlink if we have sufficient caps
  ceph: wait for async dir ops to complete before doing synchronous dir
    ops

 fs/ceph/caps.c       | 114 +++++++++++++++++++++++++------------------
 fs/ceph/dir.c        | 111 ++++++++++++++++++++++++++++++++++++++---
 fs/ceph/export.c     |   1 +
 fs/ceph/file.c       |   5 ++
 fs/ceph/inode.c      |   2 +
 fs/ceph/mds_client.c |  77 ++++++++++++++---------------
 fs/ceph/mds_client.h |   5 +-
 fs/ceph/super.h      |   4 +-
 8 files changed, 224 insertions(+), 95 deletions(-)

-- 
2.20.1