Re: Mon losing touch with OSDs

Hi Chris-

Can you confirm that both ceph-osd daemons are running v0.56.3 (i.e., 
they were restarted after the upgrade)?

sage

On Fri, 22 Feb 2013, Sage Weil wrote:
> On Sat, 23 Feb 2013, Chris Dunlop wrote:
> > On Fri, Feb 22, 2013 at 03:43:22PM -0800, Sage Weil wrote:
> > > On Sat, 23 Feb 2013, Chris Dunlop wrote:
> > >> On Fri, Feb 22, 2013 at 01:57:32PM -0800, Sage Weil wrote:
> > >>> On Fri, 22 Feb 2013, Chris Dunlop wrote:
> > >>>> G'day,
> > >>>> 
> > >>>> It seems there might be two issues here: the first being the delayed
> > >>>> receipt of echo replies causing a seemingly otherwise healthy osd to be
> > >>>> marked down, the second being the lack of recovery once the downed osd is
> > >>>> recognised as up again.
> > >>>> 
> > >>>> Is it worth my opening tracker reports for this, just so it doesn't get
> > >>>> lost?
> > >>> 
> > >>> I just looked at the logs.  I can't tell what happened to cause that 10 
> > >>> second delay.. strangely, messages were passing from 0 -> 1, but nothing 
> > >>> came back from 1 -> 0 (although 1 was queuing, if not sending, them).
> > 
> > Is there any way of telling where they were delayed, i.e. in the 1's output
> > queue or 0's input queue?
> 
> Yeah, if you bump it up to 'debug ms = 20'.  Be aware that that will 
> generate a lot of logging, though.
> 
> > >>> The strange bit is that after this, you get those indefinite hangs.  From 
> > >>> the logs it looks like the OSD rebound to an old port that was previously 
> > >>> open from osd.0.. probably from way back.  Do you have logs going further 
> > >>> back than what you posted?  Also, do you have osdmaps, say, 750 and 
> > >>> onward?  It looks like there is a bug in the connection handling code 
> > >>> (that is unrelated to the delay above).
> > >> 
> > >> Currently uploading logs starting midnight to dropbox, will send
> > >> links when they're up.
> > >> 
> > >> How would I retrieve the interesting osdmaps?
> > > 
> > > They are in the monitor data directory, in the osdmap_full dir.
> > 
> > Logs from midnight onwards and osdmaps are in this folder:
> > 
> > https://www.dropbox.com/sh/7nq7gr2u2deorcu/Nvw3FFGiy2
> > 
> >   ceph-mon.b2.log.bz2
> >   ceph-mon.b4.log.bz2
> >   ceph-mon.b5.log.bz2
> >   ceph-osd.0.log.bz2
> >   ceph-osd.1.log.bz2 (still uploading as I type)
> >   osdmaps.zip
> 
> I'll take a look...

