Re: Peering and disk utilization

This also sounds a bit like my 2nd problem here:
http://tracker.ceph.com/issues/5216
On 31.05.2013 20:36, John Nielsen wrote:
Possibly related:
http://tracker.ceph.com/issues/5084

I'm seeing the same big delays with peering. When I marked an OSD "out" and then "in" today, after a minute or two it was unexpectedly marked "down". I restarted it, and 8 or so minutes later things were fine again. In the meantime our RBD-backed KVM instances were blocking on I/O (writes especially), making them unresponsive.
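For anyone hitting the same thing, the commands involved were along these lines (a sketch only; osd.12 is a placeholder id, and the restart assumes the sysvinit scripts of this era):

    # take the OSD out of the data distribution, then mark it back in
    ceph osd out 12
    ceph osd in 12

    # if the OSD is then wrongly marked down, restart the daemon on its host
    service ceph restart osd.12

    # watch peering/recovery progress until the cluster settles
    ceph -w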

On May 3, 2013, at 3:30 AM, Erdem Agaoglu <erdem.agaoglu@xxxxxxxxx> wrote:

I'm not sure if the problems we are seeing are the same, but it looks like it. Just a few hours ago, one slow OSD caused a lot of problems for us. It was somehow reported down, and while the cluster was trying to adjust, it said it was wrongly marked down, so it seems some pgs were stuck in peering. We restarted the OSD and the cluster adjusted; after a while it was reported down again and the whole process repeated. We thought we should keep the OSD down, so we set noup and waited a while, with no luck, then repeated. Even though there seemed to be no hardware problem, we decided to set the OSD out and started recovery. Initial peering, as you said, seems to be so resource intensive that it caused another ~10 OSDs to be reported down, which increased the number of pgs in peering, and then they all said they were wrongly marked down...

We have already lowered all the recovery parameters, so recovery takes about 2-3 hours now, but that makes no difference in the starting phase of the recovery process, which may take up to 10 minutes. We have RBD-backed KVM instances and they are totally frozen for those 10 minutes. And if some pgs are stuck in peering, it requires manual intervention (a restart is what we could come up with) before anything can actually continue working. We found http://www.spinics.net/lists/ceph-users/msg00009.html but it doesn't offer much. We run 0.56.4.
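For reference, the manual steps I described amount to something like this (a rough sketch, not our exact history; osd.7 is a placeholder id):

    # keep the flapping OSD from being marked up again while investigating
    ceph osd set noup

    # after giving up on it: clear the flag and push the OSD out to start recovery
    ceph osd unset noup
    ceph osd out 7

    # the only workaround we found for pgs stuck in peering
    service ceph restart osd.7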


On Thu, May 2, 2013 at 4:57 PM, Andrey Korolyov <andrey@xxxxxxx> wrote:
Hello,

Speaking of the rotating-media-under-filestore case (which must be the most common in Ceph deployments), can peering be made less greedy with disk operations without lengthening the 'blackhole' timeout during which it blocks client operations? I'm suffering from a very long and very disk-intensive peering process even on relatively small reweights once there is a more or less significant commit on the underlying storage (50% is very hard to deal with; 10% of disk commit is far more acceptable). Recovery by itself can be throttled low enough not to compete with client disk I/O, but slowing down the peering process just means freezing client I/O for longer, that's all.

Cuttlefish seems to do part of the disk controller's job of merging writes, but peering is still unacceptably long for an _IOPS_-intensive cluster (5 MB/s and 800 IOPS on every disk during peering; despite the controller aligning head movements, the disks are 100% busy). An SSD-based cluster should not die under a lack of IOPS, but the price of such a thing is still closer to TrueEnterpriseStorage(tm) than to any solution I can afford.
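For completeness, by "throttled low enough" I mean the usual [osd] recovery knobs in ceph.conf, e.g. (values illustrative, not a recommendation):

    [osd]
        # at most one backfill and one active recovery op per OSD
        osd max backfills = 1
        osd recovery max active = 1
        # weight client ops well above recovery ops
        osd client op priority = 63
        osd recovery op priority = 1

None of these settings touch peering itself, which is exactly the problem.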

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
