On Sun, 9 Oct 2011, Martin Mailand wrote:
> Hi,
> I am using v3.1-rc9, so the fix is in there. Maybe I can nail it down a bit
> more specifically.

You might try sysrq-t or -w to see what the spinning CPUs are doing.

Thanks!
sage

>
> Best Regards,
> martin
>
> Sage Weil wrote:
> > Hi Christian,
> >
> > On Sat, 8 Oct 2011, Christian Brunner wrote:
> > > Hi,
> > >
> > > I've upgraded ceph from 0.32 to 0.36 yesterday. Now I have a totally
> > > screwed ceph cluster. :(
> > >
> > > What bugs me most is the fact that OSDs become unresponsive
> > > frequently. The process is eating a lot of cpu and I can see the
> >
> > What version of btrfs are you running? This sounds a bit like the bug
> > fixed by this patch:
> >
> > http://www.spinics.net/lists/linux-btrfs/msg12627.html
> >
> > (That was just merged into mainline this week.)
> >
> > > following messages in the log:
> > >
> > > Oct 8 22:30:05 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map
> > > is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60
> > > Oct 8 22:30:10 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map
> > > is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60
> > > Oct 8 22:30:15 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map
> > > is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60
> > > Oct 8 22:30:20 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map
> > > is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60
> > > Oct 8 22:30:25 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map
> > > is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60
> > > Oct 8 22:30:30 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map
> > > is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60
> > >
> > > Do you have any idea what to do about that?
> >
> > Those messages just mean that a thread in the disk threadpool (which is
> > doing all the writes to btrfs) is blocked/stopped.
> >
> > sage

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
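
For anyone following the thread who wants to see which threadpool thread is reported as stuck, and how often, here is a rough sketch that tallies the heartbeat_map timeout messages quoted above. The regular expression is only inferred from the six log lines in this thread (joined back onto one line each, as syslog would write them), and the function name count_timeouts is made up for illustration; neither is part of Ceph, and the message format may well differ in other versions.

```python
import re
from collections import Counter

# Pattern inferred from the syslog lines quoted in this thread.
# This is an assumption, not a documented Ceph log format.
TIMEOUT_RE = re.compile(
    r"heartbeat_map\s+is_healthy\s+'(?P<thread>[^']+)'"
    r"\s+had timed out after\s+(?P<secs>\d+)"
)


def count_timeouts(lines):
    """Count timeout reports per thread description (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.group("thread")] += 1
    return counts


# Two of the lines from the original report, unwrapped onto one line each,
# plus one unrelated line that should be ignored.
SAMPLE = [
    "Oct 8 22:30:05 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map "
    "is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60",
    "Oct 8 22:30:10 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map "
    "is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60",
    "Oct 8 22:30:11 os00 kernel: some unrelated message",
]

if __name__ == "__main__":
    for thread, n in count_timeouts(SAMPLE).items():
        print(f"{thread}: {n} timeout report(s)")
```

If every report names the same thread, as in the log excerpt above, that would fit the explanation that a single disk threadpool thread is blocked, and the sysrq output is then the place to look for what that thread is waiting on.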