Re: [WRN] map e### wrongly marked me down or wrong addr

On Mon, 27 Feb 2012, Székelyi Szabolcs wrote:
> Hello,
> 
> whenever I restart osd.0 I see a pair of messages like
> 
> 2012-02-27 17:26:00.132666 mon.0 <osd_1_ip>:6789/0 106 : [INF] osd.0 
> <osd_0_ip>:6801/29931 failed (by osd.1 <osd_1_ip>:6806/20125)
> 2012-02-27 17:26:21.074926 osd.0 <osd_0_ip>:6801/29931 1 : [WRN] map e370 
> wrongly marked me down or wrong addr
> 
> a couple of times. The situation stabilizes in a normal state after about two 
> minutes.
> 
> Should I worry about this? Maybe the first message is about the just killed 
> OSD, and the second comes from the new incarnation, and this is completely 
> normal? This is Ceph 0.41.

It's not normal.  Wido was seeing something similar, I think.  I suspect 
the problem is that ceph-osd is just busy during startup, but the heartbeat 
code is written so that it shouldn't miss heartbeats even then.  

Can you reproduce this with 'debug ms = 1'?
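
A minimal sketch of one way to set that, assuming the option goes in the 
[osd] section of ceph.conf on the affected node and the daemon is restarted 
afterwards (the section placement is an assumption; [global] would also 
apply it to the monitors):

```
[osd]
        ; log messenger traffic, including heartbeat messages
        debug ms = 1
```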

sage
