Re: cosd multi-second stalls cause "wrongly marked me down"

On Wed, 16 Feb 2011, Jim Schutt wrote:
> On Wed, 2011-02-16 at 14:40 -0700, Gregory Farnum wrote:
> > On Wednesday, February 16, 2011 at 1:25 PM, Jim Schutt wrote:
> > > Hi,
> > >
> > > I've been testing v0.24.3 w/ 64 clients against
> > > 1 mon, 1 mds, 96 osds. Under heavy write load I
> > > see:
> > >  [WRN] map e7 wrongly marked me down or wrong addr
> > >
> > > I was able to sort through the logs and discover that when
> > > this happens I have large gaps (10 seconds or more) in osd
> > > heartbeat processing. In those heartbeat gaps I've discovered
> > > long periods (5-15 seconds) where an osd logs nothing, even
> > > though I am running with debug osd/filestore/journal = 20.
> > >
> > > Is this a known issue?
> > 
> > You're running on btrfs? 
> 
> Yep.

Are the cosd log files on the same btrfs volume as the btrfs data, or 
elsewhere?  The heartbeat thread takes some pains to avoid any locks that 
may be contended and to avoid any disk I/O, so in theory a btrfs stall 
shouldn't affect anything.  We may have missed something, though.  Do you 
have a log showing this in action?
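
If it helps with pinning down the stalls, here is a rough sketch of how 
the silent periods could be pulled out of a log.  It assumes each log line 
begins with a timestamp of the form "YYYY-MM-DD HH:MM:SS.ffffff"; that 
format, the function name find_gaps, and the 5 second threshold are all 
assumptions for illustration, not anything taken from cosd itself.

```python
from datetime import datetime

# Assumed timestamp layout at the start of each log line (hypothetical).
TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"
TS_LEN = len("2011-02-16 13:25:00.000000")


def find_gaps(lines, threshold_s=5.0):
    """Return (gap_seconds, before, after) for each pair of consecutive
    timestamped lines that are at least threshold_s seconds apart."""
    gaps = []
    prev = None
    for line in lines:
        try:
            ts = datetime.strptime(line[:TS_LEN], TS_FORMAT)
        except ValueError:
            # Line does not start with a timestamp in the assumed format.
            continue
        if prev is not None:
            delta = (ts - prev).total_seconds()
            if delta >= threshold_s:
                gaps.append((delta, prev, ts))
        prev = ts
    return gaps


if __name__ == "__main__":
    import sys

    # Usage: python find_gaps.py <logfile>
    with open(sys.argv[1], errors="replace") as f:
        for delta, before, after in find_gaps(f):
            print(f"{delta:8.3f}s of silence between {before} and {after}")
```

Running this over a whole osd log, and again over only the heartbeat 
lines from that log, would show whether the silent periods line up with 
the heartbeat gaps.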

sage


> 
> > We've come across some issues involving very long sync times that I believe manifest like this. Sage is looking into them, although it's delayed at the moment thanks to FAST 11. :)
> 
> OK, great.
> 
> -- Jim
> 
> > -Greg
> > 
> 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html