Re: Mon losing touch with OSDs

On Sun, Feb 17, 2013 at 05:44:29PM -0800, Sage Weil wrote:
> On Mon, 18 Feb 2013, Chris Dunlop wrote:
>> On Sat, Feb 16, 2013 at 09:05:21AM +1100, Chris Dunlop wrote:
>>> On Thu, Feb 14, 2013 at 08:57:11PM -0800, Sage Weil wrote:
>>>> On Fri, 15 Feb 2013, Chris Dunlop wrote:
>>>>> In an otherwise seemingly healthy cluster (ceph 0.56.2), what might cause the
>>>>> mons to lose touch with the osds?
>>>> 
>>>> Can you enable 'debug ms = 1' on the mons and leave them that way, in the 
>>>> hopes that this happens again?  It will give us more information to go on.
>>> 
>>> Debug turned on.
>> 
>> We haven't experienced the cluster losing touch with the osds completely
>> since upgrading from 0.56.2 to 0.56.3, but we did lose touch with osd.1
>> for a few seconds before it recovered. See below for logs (reminder: 3
>> boxes, b2 is mon-only, b4 is mon+osd.0, b5 is mon+osd.1).
> 
> Hrm, I don't see any obvious clues.  You could enable 'debug ms = 1' on 
> the osds as well.  That will give us more to go on if/when it happens 
> again, and should not affect performance significantly.

Done: ceph osd tell '*' injectargs '--debug-ms 1'

Now to wait for it to happen again.
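
For anyone following along: injectargs only changes the running daemons,
so the setting is lost when a daemon restarts. A minimal sketch of the
persistent equivalent in ceph.conf, assuming the usual per-daemon-type
sections (the placement below is an assumption, not something taken from
this cluster's actual config):

```ini
; Hypothetical ceph.conf fragment: keeps messenger debugging at level 1
; across daemon restarts, matching what was injected at runtime above.
[mon]
    debug ms = 1

[osd]
    debug ms = 1
```

Once the problem has been captured, removing these lines (or setting the
value back to 0) and restarting the daemons should return logging to normal.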

Chris