Re: Suicide

I looked through your logs a bit and noticed that the OSD on node01 is crashing due to high latencies on disk access (I think the default behavior in this case is to assert out if there's no progress after 10 minutes or so).
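
To illustrate what "asserting out on no progress" means, here is a rough Python sketch. It is not the actual OSD code: the class name, the method names, the injectable clock, and the 600-second value are all assumptions made up for the example, and the real timeout may differ.

```python
import time


class ProgressWatchdog:
    """Hypothetical sketch: give up if no progress is seen within a grace period."""

    def __init__(self, suicide_timeout_s, clock=time.monotonic):
        # suicide_timeout_s is an assumed name for the grace period, in seconds.
        self.suicide_timeout_s = suicide_timeout_s
        self.clock = clock
        self.last_progress = clock()

    def note_progress(self):
        # Called whenever the worker completes a unit of work (e.g. a disk sync).
        self.last_progress = self.clock()

    def check(self):
        # Called periodically; asserts out if the worker has stalled for too long.
        stalled_for = self.clock() - self.last_progress
        assert stalled_for <= self.suicide_timeout_s, (
            f"no progress for {stalled_for:.0f}s, giving up"
        )
```

With a slow disk, note_progress() is simply not called often enough, so check() eventually fires even though nothing is logically wrong with the daemon.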

Based on that, I pretty much have to guess that there's just too much stress on your disk and it's going to keep causing problems. You can try loosening the various configurable timeouts to let it run longer, but it seems like you really just need beefier disks for the amount of work you're putting on them. IIRC you're running a monitor and an OSD on the same 2.5" physical disk, which means they're colliding on things like sync() calls.

This general slowness doesn't explain the mds log corruption, although it might be one of the trigger conditions. I added another assert in the Journaler code for a case which might have caused the problem (though I don't think it could have), but I don't have any other new ideas.
-Greg


