Hi,
I got a recommendation from Stephan to restart the OSDs one by one, so I did. It helped a bit (some IOs completed), but in the end the state was the same as before and new IOs still hung. Loïc, thanks for the advice on bringing osd.0 and osd.4 back into the game.
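For reference, the one-by-one restart amounted to something like the following on each OSD host (a minimal sketch, assuming the same Firefly 0.80.7 sysvinit service shown below; the hostname and OSD ids are just examples):

[root@qvitblhat10 ~]# service ceph restart osd.1   # restart a single OSD daemon
[root@qvitblhat10 ~]# ceph -s                      # wait for peering/recovery to settle before the next one
[root@qvitblhat10 ~]# service ceph restart osd.5
[root@qvitblhat10 ~]# ceph -s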
Moving them back was actually done by simply restarting ceph on that node:
[root@qvitblhat12 ~]# date;service ceph status
Tue Dec 23 14:36:11 UTC 2014
=== osd.0 ===
osd.0: running {"version":"0.80.7"}
=== osd.4 ===
osd.4: running {"version":"0.80.7"}
[root@qvitblhat12 ~]# date;service ceph restart
Tue Dec 23 14:36:17 UTC 2014
=== osd.0 ===
=== osd.0 ===
Stopping Ceph osd.0 on qvitblhat12...kill 4527...kill 4527...done
=== osd.0 ===
create-or-move updating item name 'osd.0' weight 0.27 at location {host=qvitblhat12,root=default} to crush map
Starting Ceph osd.0 on qvitblhat12...
Running as unit run-4398.service.
=== osd.4 ===
=== osd.4 ===
Stopping Ceph osd.4 on qvitblhat12...kill 5375...done
=== osd.4 ===
create-or-move updating item name 'osd.4' weight 0.27 at location {host=qvitblhat12,root=default} to crush map
Starting Ceph osd.4 on qvitblhat12...
Running as unit run-4720.service.

[root@qvitblhat06 ~]# ceph osd tree
# id    weight  type name               up/down reweight
-1      1.62    root default
-5      1.08            datacenter dc_XAT
-2      0.54                    host qvitblhat10
1       0.27                            osd.1   up      1
5       0.27                            osd.5   up      1
-4      0.54                    host qvitblhat12
0       0.27                            osd.0   up      1
4       0.27                            osd.4   up      1
-6      0.54            datacenter dc_QVI
-3      0.54                    host qvitblhat11
2       0.27                            osd.2   up      1
3       0.27                            osd.3   up      1
[root@qvitblhat06 ~]#

This change made ceph rebalance the data, and then, the miracle: all PGs ended up active+clean.

[root@qvitblhat06 ~]# ceph health detail
HEALTH_WARN noscrub,nodeep-scrub flag(s) set
noscrub,nodeep-scrub flag(s) set

Well, apart from being happy that the cluster is now healthy, I find it a little bit scary to have to shake it in one direction and another and hope that it will eventually recover, while in the meantime my users' IOs are stuck... So is there a way to understand what happened?

Francois