So I took a power hit today, and after coming back up, 3 of my OSDs and my radosgw are not coming back up. The logs give no clue as to what may have happened.
When I manually try to restart the gateway, I see the following in the logs, and then the process dies:

2013-10-10 16:04:23.166046 7f8480d9a700 2 RGWDataChangesLog::ChangesRenewThread: start
2013-10-10 16:04:45.166193 7f8480d9a700 2 RGWDataChangesLog::ChangesRenewThread: start
2013-10-10 16:05:07.166335 7f8480d9a700 2 RGWDataChangesLog::ChangesRenewThread: start
2013-10-10 16:05:29.166501 7f8480d9a700 2 RGWDataChangesLog::ChangesRenewThread: start
2013-10-10 16:05:51.166638 7f8480d9a700 2 RGWDataChangesLog::ChangesRenewThread: start
2013-10-10 16:06:13.166762 7f8480d9a700 2 RGWDataChangesLog::ChangesRenewThread: start
2013-10-10 16:06:35.166914 7f8480d9a700 2 RGWDataChangesLog::ChangesRenewThread: start
2013-10-10 16:06:57.167055 7f8480d9a700 2 RGWDataChangesLog::ChangesRenewThread: start
2013-10-10 16:07:10.196475 7f848535c700 -1 Initialization timeout, failed to initialize

As for the OSDs, there is no logging at all. When I try to start them manually, it reports they are already running, but there are no OSD pids on that server:

$ sudo start ceph-all
start: Job is already running: ceph-all

Any ideas where to look for more info on these two issues? I am running ceph 0.67.3.

Cluster status:

HEALTH_WARN 78 pgs down; 78 pgs peering; 78 pgs stuck inactive; 78 pgs stuck unclean; 16 requests are blocked > 32 sec; 1 osds have slow requests

$ ceph osd stat
e134: 18 osds: 15 up, 15 in

Thanks,
Mike