We had a number of disks fail, which is why Ceph was having trouble keeping the OSDs up. We also found that during recovery the rados gateway failed to initialize: the init_watch function timed out. Since it is only used when the cache is enabled, we disabled the cache (rgw cache enabled = false) and the radosgateway started :) !

-----Original Message-----
From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Yann ROBIN
Sent: Tuesday, 18 December 2012 15:08
To: ceph-devel@xxxxxxxxxxxxxxx
Subject: RE: Recovery stuck and radosgateway not initializing

Our configuration: 6 OSDs, 3 mons. The journal is on an INTEL SSDSA2CW120G3 disk and the data is on a Hitachi HUS724040ALE640 disk.
When an OSD does recovery, I/O is high, and at some point the OSD is killed. We set max active recovery to 1 and set filestore op thread suicide timeout to 360.
What should I do in that case?

-----Original Message-----
From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Yann ROBIN
Sent: Tuesday, 18 December 2012 11:51
To: ceph-devel@xxxxxxxxxxxxxxx
Subject: Recovery stuck and radosgateway not initializing

Hi,

We're using ceph v0.55, and last night we lost one node of our cluster. When it came back, ceph started recovering, but since then the radosgateway has not been able to connect to the cluster. The rados gateway times out on initialization (somewhere in the radosclient connect).
The other problem (and I think it's related) is that the recovery isn't working. The OSDs get OSD op thread timeouts and sometimes some of the OSDs crash (see stacktrace attached). So it seems that our OSDs aren't up long enough for the recovery to proceed.
Any help would be appreciated.
Thanks,
--
Yann
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
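
For reference, the three settings mentioned in this thread could be written in ceph.conf roughly as below. This is only a sketch under a few assumptions: "max active recovery" is taken to mean the osd recovery max active option, the cache setting is spelled rgw cache enabled, and the gateway section name [client.radosgw.gateway] is hypothetical and has to match whatever name the gateway instance actually runs under.

```ini
[osd]
    ; limit concurrent recovery operations per OSD (value from the thread)
    osd recovery max active = 1
    ; give filestore op threads longer before the OSD kills itself (seconds)
    filestore op thread suicide timeout = 360

; hypothetical section name, adjust to the local gateway instance
[client.radosgw.gateway]
    ; workaround from the thread: skip the cache so init_watch is not used
    rgw cache enabled = false
```

The daemons need to be restarted to pick up changes made in the file. Disabling the cache is a workaround to let the gateway start while the cluster recovers, so it would probably make sense to re-enable it once the cluster is healthy again.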