Greg --

Not sure how to tell whether rebalancing occurred at that time.  I do see
complaints in other osd logs that they do not get a reply from osd.8 starting
around 15:52 on that day.  I also see the deep-scrub of pool 14, but that was
almost 30 minutes earlier.

-- Tom

> -----Original Message-----
> From: Gregory Farnum [mailto:greg@xxxxxxxxxxx]
> Sent: Monday, July 13, 2015 11:45 AM
> To: Deneau, Tom
> Cc: ceph-devel
> Subject: Re: osd suicide timeout
>
> heartbeat_map reset_timeout 'OSD::osd_op_tp thread 0x3ff6eb0efd0' had suicide
> timed out after 150
>
> So that's the OSD's "op thread", which is the one that does most of the work.
> You often see the FileStore::op_tp when it's the disk or filesystem breaking,
> but I do see the line
>
> waiting 51 > 50 ops || 57248008 > 104857600
>
> which leaves me feeling pretty confident that the disk is just getting more
> work than it can keep up with.  It looks like there was some rebalancing
> happening around this time?
> -Greg
>
>
> On Mon, Jul 13, 2015 at 5:04 PM, Deneau, Tom <tom.deneau@xxxxxxx> wrote:
> > Greg --
> >
> > Thanks.  I put the osd.log file at
> >
> > https://drive.google.com/file/d/0B_rfwWh40kPwQjZ3OXdjLUZNRVU/view?usp=sharing
> >
> > I noticed the following in the journalctl output around that time, showing
> > that other nodes were complaining they could not reach osd.8:
> >
> > Jul 09 15:53:04 seattle-04-ausisv bash[8486]: 2015-07-09 15:53:03.905386
> > 3ffa0d9efd0 -1 osd.9 2487 heartbeat_check: no reply from osd.8 since back
> > 2015-07-09 15:52:43.256581 front 2015-07-09 15:52:43.256581 (cutoff
> > 2015-07-09 15:52:43.905384)
> > Jul 09 15:53:06 seattle-04-ausisv bash[1060]: 2015-07-09 15:53:06.784069
> > 3ff916fefd0 -1 osd.7 2487 heartbeat_check: no reply from osd.8 since back
> > 2015-07-09 15:52:46.474273 front 2015-07-09 15:52:46.474273 (cutoff
> > 2015-07-09 15:52:46.784066)
> >
> > and here is some sar data for the disk that osd.8 was controlling
> > (sde1 = journal partition, sde2 = data partition):
> >
> > 03:40:02 PM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
> > 03:50:17 PM      sde1      9.67      0.00  19046.40   1970.32      0.35     35.86     30.62     29.60
> > 03:50:17 PM      sde2     60.47   1524.27  14597.67    266.63     24.45    404.30      8.54     51.67
> > 03:50:32 PM      sde1     12.13      0.00  18158.93   1496.62      0.25     20.66     17.58     21.33
> > 03:50:32 PM      sde2     28.00   1550.93  17958.33    696.76     10.54    376.50     13.52     37.87
> > 03:50:47 PM      sde1     12.73      0.00  25446.40   1998.41      0.31     24.19     22.30     28.40
> > 03:50:47 PM      sde2     51.60    338.67  18091.73    357.18     13.05    252.91      8.02     41.40
> > 03:51:02 PM      sde1     12.27      0.00  18790.40   1531.83      0.31     25.33     18.53     22.73
> > 03:51:02 PM      sde2     33.13   2635.20  18026.67    623.60      5.02    151.57     10.99     36.40
> > 03:51:17 PM      sde1     10.13      0.00  14557.87   1436.63      0.16     16.18     12.76     12.93
> > 03:51:17 PM      sde2     46.73   1107.73  12067.00    281.91      8.55    182.88      5.46     25.53
> > 03:51:32 PM      sde1     11.93      0.00  18594.13   1558.17      0.35     29.27     16.42     19.60
> > 03:51:32 PM      sde2     22.20    555.20  18834.33    873.40      4.24    191.08     13.51     30.00
> > 03:51:47 PM      sde1     18.00      0.00  13926.40    773.69      0.19     10.78     10.07     18.13
> > 03:51:47 PM      sde2     47.27   1652.80  10775.53    262.94     12.24    259.01      6.66     31.47
> > 03:52:02 PM      sde1     21.60      0.00  10845.87    502.12      0.24     11.08      9.75     21.07
> > 03:52:02 PM      sde2     34.33   1652.80   9089.13    312.87      7.43    216.41      8.45     29.00
> > 03:52:17 PM      sde1     19.87      0.00  20198.40   1016.70      0.33     16.85     13.46     26.73
> > 03:52:17 PM      sde2     35.60   2752.53  16355.53    536.74     11.90    333.33     10.90     38.80
> > 03:52:32 PM      sde1     22.54      0.00   8434.04    374.18      0.15      6.67      6.17     13.90
> > 03:52:32 PM      sde2     35.84   2738.30   4586.30    204.38      2.01     28.11      6.53     23.40
> > 03:52:47 PM      sde1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
> > 03:52:47 PM      sde2     13.37     35.83   1101.80     85.09      1.87    218.65      5.75      7.69
> > 03:53:02 PM      sde1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
> > 03:53:02 PM      sde2      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
> > 03:53:17 PM      sde1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
> > 03:53:17 PM      sde2      0.13      0.00      0.20      1.50      0.00     20.00     20.00      0.27
> > 03:53:32 PM      sde1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
> > 03:53:32 PM      sde2      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
> > 03:53:47 PM      sde1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
> > 03:53:47 PM      sde2      0.13      0.00      0.20      1.50      0.00      5.00      5.00      0.07
> > 03:54:02 PM      sde1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
> > 03:54:02 PM      sde2      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
> > 03:54:17 PM      sde1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
> > 03:54:17 PM      sde2      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
> > 03:54:32 PM      sde1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
> > 03:54:32 PM      sde2      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
> >
> > -- Tom
> >
> >> -----Original Message-----
> >> From: Gregory Farnum [mailto:greg@xxxxxxxxxxx]
> >> Sent: Monday, July 13, 2015 5:07 AM
> >> To: Deneau, Tom
> >> Cc: ceph-devel
> >> Subject: Re: osd suicide timeout
> >>
> >> On Fri, Jul 10, 2015 at 10:45 PM, Deneau, Tom <tom.deneau@xxxxxxx> wrote:
> >> > I have an osd log file from an osd that hit a suicide timeout (with the
> >> > previous 10000 events logged).
> >> > (On this node I have also seen this suicide timeout happen once before,
> >> > and also a sync_entry timeout.)
> >> >
> >> > I can see that 6 minutes or so before that osd died, other osds on the
> >> > same node were logging messages such as
> >> >     heartbeat_check: no reply from osd.8
> >> > so it appears that osd.8 stopped responding quite some time before it
> >> > died.
> >>
> >> It does this deliberately when its disk threads look like they might be
> >> stuck.
> >>
> >> > I'm wondering if there is enough information in the osd.8 log file to
> >> > deduce why osd.8 stopped responding?  I don't know enough to figure it
> >> > out myself.
> >> >
> >> > Is there any expert who would be willing to take a look at the log file?
> >>
> >> The logs will have a backtrace in them; if you can include that and the
> >> last hundred lines or so prior in a pastebin that you email to the list,
> >> several people can give you a pretty good idea of what's going on.
> >>
> >> In general though, it's just going to be that the disk can't keep up with
> >> the load being applied to it.  That could be because it's failing, or
> >> because you're pushing too much work onto it in some fashion.
> >> -Greg
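
For reference, the numbers in the two log lines Greg quotes appear to correspond
to stock OSD settings (a sketch assuming Hammer-era defaults; confirm against
your own ceph.conf).  The "waiting 51 > 50 ops || 57248008 > 104857600" message
looks like the FileStore queue throttle tripping at its 50-op / 104857600-byte
(100 MB) defaults, and "had suicide timed out after 150" is the op thread's
150-second suicide timeout.  Something like the following, run on the node
hosting osd.8 against its admin socket, would confirm what the daemon is
actually using and show where recent requests were spending their time:

  # FileStore queue throttle -- the two limits in the "waiting 51 > 50 ops ..." line
  ceph daemon osd.8 config get filestore_queue_max_ops
  ceph daemon osd.8 config get filestore_queue_max_bytes

  # Op-thread suicide timeout -- the 150 s in the heartbeat_map message
  ceph daemon osd.8 config get osd_op_thread_suicide_timeout

  # Recent slow requests with per-step timestamps, useful for seeing where ops
  # were stuck in the minutes leading up to the abort
  ceph daemon osd.8 dump_historic_ops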
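
Similarly, the heartbeat_check lines from osd.7 and osd.9 follow from the
heartbeat grace period: the "cutoff" they print is the current time minus
osd_heartbeat_grace (20 s by default, which matches the 20-second gap in the
timestamps above), so osd.8 had gone at least that long without answering ping
messages.  A rough way to cross-check this, assuming default settings and the
default cluster-log location on the monitor hosts:

  # Grace period that produces the "cutoff" value in the heartbeat_check messages
  ceph daemon osd.9 config get osd_heartbeat_grace

  # The monitors record the peer failure reports and when osd.8 was marked down;
  # the same cluster log's pgmap lines should also show whether recovery/backfill
  # I/O was running around 15:52, which would speak to the rebalancing question.
  grep 'osd.8' /var/log/ceph/ceph.log | grep -i failed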