Re: Blocked requests during and after CephFS delete

Gregory Farnum <greg@xxxxxxxxxxx> · Sun, 8 Dec 2013 19:50:17 -0800



On Sun, Dec 8, 2013 at 7:16 AM, Oliver Schulz <oschulz@xxxxxxxxxx> wrote:
> Hello Ceph-Gurus,
>
> a short while ago I reported some trouble we had with our cluster
> suddenly going into a state of "blocked requests".
>
> We did a few tests, and we can reproduce the problem:
> During and after deleting a substantial chunk of data on
> CephFS (a few TB), ceph health shows blocked requests like
>
>     HEALTH_WARN 222 requests are blocked > 32 sec
>
> This goes on for a couple of minutes, during which the cluster is
> pretty much unusable. The number of blocked requests jumps around
> (but seems to go down on average), until finally (after about 15
> minutes in my last test) health is back to OK.
>
> I upgraded the cluster to Ceph emperor (0.72.1) and repeated the
> test, but the problem persists.
>
> Is this normal - and if not, what might be the reason? Obviously,
> having the cluster go on strike for a while after data deletion
> is a bit of a problem, especially with a mixed application load.
> The VMs running on RBDs aren't too happy about it, for example. ;-)

Nobody's reported it before, but I think the CephFS MDS is sending out
too many delete requests. When you delete something in CephFS, it's
just marked as deleted, and the MDS is supposed to remove the underlying
objects asynchronously in the background, but I'm not sure if there are
any throttles on how quickly it does so. If you remove several terabytes'
worth of data, and the MDS is sending out RADOS object deletes for each
4MB object as fast as it can, that's a lot of unthrottled traffic on the
OSDs.
That's all speculation on my part though; can you go sample the slow
requests and see what their makeup looked like? Do you have logs from
the MDS or OSDs during that time period?
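
To put a rough number on that speculation, here is a back-of-the-envelope
sketch in Python. The 3 TiB figure is an assumption standing in for "a few
TB", the 15 minutes comes from the report above, and the function names are
made up for illustration. It also assumes every file is fully striped into
4 MiB objects, so it is a lower bound: small files still cost at least one
object each.

```python
# Rough estimate of how many RADOS object deletes a large CephFS removal
# could generate. All inputs are illustrative assumptions, not measurements.

MIB = 1024 ** 2
TIB = 1024 ** 4


def object_deletes(bytes_deleted: int, object_size: int = 4 * MIB) -> int:
    """Number of objects covering bytes_deleted, rounded up."""
    return -(-bytes_deleted // object_size)  # ceiling division


def deletes_per_second(bytes_deleted: int, seconds: float,
                       object_size: int = 4 * MIB) -> float:
    """Average delete rate if all objects are removed within `seconds`."""
    return object_deletes(bytes_deleted, object_size) / seconds


if __name__ == "__main__":
    deleted = 3 * TIB   # assumed size of the removed data
    window = 15 * 60    # seconds until health returned to OK in the report
    print("object deletes:", object_deletes(deleted))
    print("average deletes/s:", round(deletes_per_second(deleted, window)))
```

Under those assumptions that is several hundred thousand deletes, averaging
several hundred per second across the cluster if they all complete in that
window, before counting the work each replica has to do for every delete.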
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com