Re: Hit suicide timeout after adding new osd

On 01/23/2013 01:14 PM, Jens Kristian Søgaard wrote:
Hi Sage,

I think the problem now is just that 'osd target transaction size' is
too big (default is 300).  Recommended 50.. let's see how that goes.
Even smaller (20 or 25) would probably be fine.


Having gone through the code, and reading that this solved it for Jens: could this issue be traced back to less powerful CPUs?

I've seen this on Atom and Fusion platforms, neither of which excels in computing power.

From what I read, the OSD by default batches up to 300 transactions and then commits them? If the CPU is too slow to handle all the work, timeouts can occur because it can't finish all the transactions inside the set window?

Lowering the number of transactions per batch means the OSD sends out a heartbeat more often, thus keeping itself alive.

Correct?
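
The reasoning above can be sketched as a toy model. This is only an illustration of the hypothesis, not how the OSD is implemented: it assumes the heartbeat is refreshed once per batch, and the function names, the per-transaction cost of 0.5 s, and the 120 s timeout are all made-up values chosen for the example.

```python
# Toy model of the hypothesis: the heartbeat is only refreshed after a
# whole batch of transactions completes, so the longest silent period
# grows with the batch size. All names and numbers are hypothetical.

def longest_gap_between_heartbeats(batch_size, seconds_per_transaction):
    """Worst-case time without a heartbeat while one batch is processed."""
    return batch_size * seconds_per_transaction


def hits_timeout(batch_size, seconds_per_transaction, timeout_seconds):
    """True if processing a single batch takes longer than the timeout."""
    gap = longest_gap_between_heartbeats(batch_size, seconds_per_transaction)
    return gap > timeout_seconds


if __name__ == "__main__":
    slow_cpu_cost = 0.5   # assumed seconds per transaction on a slow CPU
    timeout = 120         # assumed timeout window in seconds
    for batch in (300, 50, 20):
        gap = longest_gap_between_heartbeats(batch, slow_cpu_cost)
        print(batch, gap, hits_timeout(batch, slow_cpu_cost, timeout))
```

Under these assumed numbers a batch of 300 exceeds the window while batches of 50 or 20 stay well inside it, which is the behaviour the hypothesis predicts for a slow CPU.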

Wido

I set it to 50, and that seems to have solved all my problems.
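
For reference, a minimal sketch of what that change might look like in ceph.conf. Placing the option in the [osd] section is an assumption here; the thread only gives the option name and the value.

```ini
[osd]
    ; lower the per-batch transaction target from the default of 300
    osd target transaction size = 50
```

The OSDs would presumably need to be restarted (or have the value injected at runtime) for the new setting to take effect.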

After a day or so my cluster got to a HEALTH_OK state again. It has been
running for a few days now without any crashes!

Thanks for all your help!

