Re: cephfs slow delete

OK. Since I’m running through the Hadoop/Ceph API, there is no syscall boundary, so there is a simple place to improve throughput here. Good to know; I’ll work on a patch…
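
Roughly what I have in mind as a client-side stopgap, following Greg's suggestion below of running multiple deleters from one client, is the sketch that follows. The ceph:// URI, paths, and pool size are placeholders, and it assumes the shared Hadoop FileSystem instance is safe to use from several threads; if not, each worker could grab its own via FileSystem.newInstance().

    import java.net.URI;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import java.util.concurrent.TimeUnit;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ParallelDelete {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Placeholder URI and path -- substitute your monitor address and target dir.
            FileSystem fs = FileSystem.get(URI.create("ceph://mon-host:6789/"), conf);
            Path root = new Path("/data/to-delete");

            // One recursive delete per top-level subdirectory, run from a small
            // pool so the per-unlink MDS round trips overlap instead of being
            // issued strictly one at a time.
            ExecutorService pool = Executors.newFixedThreadPool(8);
            List<Future<Boolean>> pending = new ArrayList<>();
            for (FileStatus entry : fs.listStatus(root)) {
                final Path subtree = entry.getPath();
                pending.add(pool.submit(() -> fs.delete(subtree, true)));
            }
            for (Future<Boolean> f : pending) {
                f.get(); // surface any failure from the workers
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);

            // Remove whatever is left (files directly under root, and root itself).
            fs.delete(root, true);
            fs.close();
        }
    }

That only parallelizes across subtrees, of course; the unlinks within each subtree are still one round trip at a time, which is the part a patch to the bindings could improve.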

On 10/14/16, 3:58 PM, "Gregory Farnum" <gfarnum@xxxxxxxxxx> wrote:

    On Fri, Oct 14, 2016 at 11:41 AM, Heller, Chris <cheller@xxxxxxxxxx> wrote:
    > Unfortunately, it was all in the unlink operation. It looks as if it took nearly 20 hours to remove the dir; the round trip is a killer there. What can be done to reduce RTT to the MDS? Does the client really have to delete directories sequentially, or can it batch or parallelize internally?
    
    It's bound by the same syscall APIs as anything else. You can spin off
    multiple deleters; I'd either keep them on one client (if you want to
    work within a single directory) or, if using multiple clients, assign
    them to different portions of the hierarchy. That will let you
    parallelize across the IO latency until you hit a cap on the MDS's
    total throughput (should be 1-10k deletes/s based on latest tests,
    IIRC).
    -Greg
    
    >
    > -Chris
    >
    > On 10/13/16, 4:22 PM, "Gregory Farnum" <gfarnum@xxxxxxxxxx> wrote:
    >
    >     On Thu, Oct 13, 2016 at 12:44 PM, Heller, Chris <cheller@xxxxxxxxxx> wrote:
    >     > I have a directory I’ve been trying to remove from cephfs (via
    >     > cephfs-hadoop); the directory is a few hundred gigabytes in size and
    >     > contains a few million files, but not in a single subdirectory. I started
    >     > the delete yesterday at around 6:30 EST, and it’s still progressing. I can
    >     > see from (ceph osd df) that the overall data usage on my cluster is
    >     > decreasing, but at the rate it’s going it will be a month before the entire
    >     > subdirectory is gone. Is a recursive delete of a directory known to be a
    >     > slow operation in CephFS, or have I hit upon some bad configuration? What
    >     > steps can I take to better debug this scenario?
    >
    >     Is it the actual unlink operation taking a long time, or just the
    >     reduction in used space? Unlinks require a round trip to the MDS
    >     unfortunately, but you should be able to speed things up at least some
    >     by issuing them in parallel on different directories.
    >
    >     If it's the used space, you can let the MDS issue more RADOS delete
    >     ops by adjusting the "mds max purge files" and "mds max purge ops"
    >     config values.
    >     -Greg
    >
    >
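
As an aside, the purge throttles Greg mentions above govern how quickly the MDS reclaims space after the unlinks, and they can be raised either in ceph.conf or at runtime. A minimal example, with purely illustrative values rather than recommendations:

    # ceph.conf on the MDS hosts -- example values only
    [mds]
        mds max purge files = 256
        mds max purge ops = 32768

    # or injected at runtime, which should not require an MDS restart:
    ceph tell mds.<id> injectargs '--mds_max_purge_files 256 --mds_max_purge_ops 32768'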
    

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



