Re: [RFC PATCH 00/11] ceph: asynchronous unlink support

On Tue, Apr 9, 2019 at 3:42 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>
> Cephfs, like most network filesystems, sucks badly at metadata-heavy
> workloads. The clients (kcephfs and libcephfs) always do synchronous
> calls to the MDS for directory-morphing operations (create, unlink, link
> and rename), and those RTT delays add up.
>
> In principle, cephfs is different in that if we have appropriate caps,
> we ought to be able to buffer up directory morphing operations and
> eventually flush them out to the MDS prior to releasing those caps.
> While the biggest win from this approach is probably going to be in the
> create codepath, starting with unlink is a lot simpler.
>
> The idea here is that if we hold refs on the appropriate caps (Fx on the
> directory and Lx on the inode being unlinked), then we should be able to
> return from the syscall immediately after transmitting the unlink
> request, under the assumption that it will succeed. If the unlink does
> fail, then we'd report an error when the caller does an fsync on the
> parent directory.
>
> The series starts with some reorganization that allows the client to
> do async MDS requests, and then the last several patches add the ability
> to do an asynchronous unlink.
>
> For now, this is just an RFC series. I think we could probably take the
> first 7 or so patches for the next merge window, but the async unlink
> patches themselves should probably wait until the MDS better supports
> this.
>
> I did do a little performance testing with this, but it doesn't seem to
> improve things much if at all. Still, this is a good place to start with
> async MDS ops, and we may be able to improve things later.
>

So it turns out that with some bugfixes, I now see about a 2x
speedup when removing a directory with 10000 files in it. I think
that's enough of a proof of concept to show this approach is
worthwhile, particularly once we are able to create files
asynchronously.

Simple test script:

--------------8<-----------------
#!/bin/sh

TESTDIR="/mnt/cephfs/test.$$"

mkdir "$TESTDIR"
for i in $(seq 1 10000); do
    touch "$TESTDIR/$i"
done
time rm -r "$TESTDIR"
--------------8<-----------------

Testing on my crappy test rig:

Unpatched kernel:

$ ./test_unlink.sh
real    0m2.428s
user    0m0.011s
sys    0m0.131s

Patched kernel:

$ ./test_unlink.sh
real    0m1.272s
user    0m0.007s
sys    0m0.127s

...and the numbers were fairly consistent over multiple runs.
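
One semantic note for anyone playing with this: as described in the
cover letter, an async unlink can return success before the MDS has
replied, and a failure would only surface later, when the caller does
an fsync on the parent directory. Roughly, from userspace, checking
for that deferred error would look something like the sketch below
(the paths here are just examples):

--------------8<-----------------
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	/* Example paths -- adjust for your mount and test file. */
	int dirfd = open("/mnt/cephfs/testdir", O_RDONLY | O_DIRECTORY);

	if (dirfd < 0) {
		perror("open parent dir");
		return 1;
	}

	/* With async unlink, this can return before the MDS has acked. */
	if (unlinkat(dirfd, "somefile", 0) < 0) {
		perror("unlinkat");
		close(dirfd);
		return 1;
	}

	/*
	 * Flushing the parent directory is where a deferred failure
	 * from the MDS would be reported back to the application.
	 */
	if (fsync(dirfd) < 0)
		perror("fsync parent dir");

	close(dirfd);
	return 0;
}
--------------8<-----------------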

I pushed a tag to my repo if anyone wants to have a look, but I'll
avoid re-posting for now. This also relies on some out-of-tree (and
quite possibly dangerous) MDS patches, so it's probably not worth
wider testing just yet.

    https://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux.git/tag/?h=ceph-async-unlink-20190410

-- 
Jeff Layton <jlayton@xxxxxxxxxxxxxxx>


