2011/5/2 Sage Weil <sage@xxxxxxxxxxxx>:
> On Mon, 2 May 2011, Christian Brunner wrote:
>> after a series of hardware defects, I have a corrupted ceph cluster:
>>
>> 2011-05-02 18:12:31.038446 pg v8171648: 3712 pgs: 26 active, 3663
>> active+clean, 5 crashed+peering, 18 active+clean+inconsistent; 547 GB
>> data, 388 GB used, 51922 GB / 78245 GB avail; 2410/284300 degraded
>> (0.848%)
>>
>> Now I wanted to export an rbd image with "rbd export" and run a
>> filesystem check on the image. The only problem is that the export
>> blocks on the first corrupted object. I think it would be better to
>> detect the failure and return some blocks filled with zeros.
>>
>> Is there a way to accomplish this?
>
> Not currently. There are a couple of ways to approach it.
>
> One would be to add a timeout (either in librados or in the rados tool)
> so that it can move past unresponsive blocks (or error out). Maybe a
> 'skip this block range' option would be a piece of that.
>
> The other is to give the client explicit feedback when the pg it is
> attempting to access is not available (in your case, it's the peering
> pgs that are blocking progress). Currently those requests block at the
> OSD until peering completes, but a peering bug is preventing progress.
>
> Of course, we also need to fix the peering issue itself (any logs you
> can provide showing it blocking would help). If you run "ceph pg dump"
> and look at which ones are in peering, and restart those osds with full
> logs, we can see where things are getting hung up.

We had a rather old version running (0.24.1), so I don't think that
debugging it makes much sense. I have updated to 0.27 now.

> Probably, though, we still want a way to do useful work
> (partial/incomplete export) even when things are half-broken.

This was my intent when I wrote the email. In a larger cluster the
probability of losing multiple disks at a time increases.
The amount of data you lose when it happens is minimal, but since rbd
images are striped across many disks, chances are that you lose a single
block in many images.

Christian
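The zero-fill behavior the thread asks for could be sketched like this: read the image object by object, and when a read fails or hangs (e.g. because the object's pg is stuck peering), substitute a block of zeros so the export can finish and fsck can run on the result. This is only an illustration; `read_block` is a hypothetical per-object reader, not an existing librados/librbd call, and the 4 MB block size just mirrors the rbd default object size.

```python
BLOCK_SIZE = 4 * 1024 * 1024  # rbd's default object size

def export_with_zero_fill(read_block, num_blocks, block_size=BLOCK_SIZE):
    """Export an image block by block, zero-filling unreadable blocks.

    read_block(i) is a hypothetical reader that is assumed to raise
    (e.g. after an internal timeout) when object i is unavailable.
    """
    out = []
    for i in range(num_blocks):
        try:
            out.append(read_block(i))
        except IOError:
            # pg unavailable or object lost: emit zeros instead of
            # blocking, so the partial export can still complete
            out.append(b"\0" * block_size)
    return b"".join(out)
```

A real version would also want to log which block ranges were zero-filled, so you know where fsck findings might be artifacts of the export rather than of the original filesystem.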