Hi,

I upgraded Ceph from 0.32 to 0.36 yesterday, and now I have a totally screwed Ceph cluster. :(

What bugs me most is that the OSDs frequently become unresponsive. The process eats a lot of CPU, and I see the following messages in the log:

Oct 8 22:30:05 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60
Oct 8 22:30:10 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60
Oct 8 22:30:15 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60
Oct 8 22:30:20 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60
Oct 8 22:30:25 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60
Oct 8 22:30:30 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60

Do you have any idea what to do about that?

Regards,
Christian