Hello,

On Tue, 26 May 2015 10:00:13 -0600 Robert LeBlanc wrote:

> I've seen I/O become stuck after we have done network torture tests.
> It seems that after so many retries the OSD peering just gives up
> and doesn't retry any more. An OSD restart kicks off another round of
> retries and the I/O completes. It seems like there was some discussion
> about this on the devel list recently.
>
While that certainly sounds plausible, the Ceph network of my cluster
wasn't particularly busy or tortured at that time at all.
I suppose other factors might cause similar behavior, so a good way
forward would probably be to ensure that retries happen without
limitation and at a "reasonable" interval.

As for Xavier's question: no, I never filed a bug, that thread was all
there is. Since I didn't have anything to report other than "it
happened", and neither do you really, it is doubtful the devs can
figure out what exactly caused it.
So, as I wrote above, it is probably best to make sure it keeps
retrying no matter what.

Christian

> ----------------
> Robert LeBlanc
> GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>
>
> On Tue, May 26, 2015 at 4:06 AM, Xavier Serrano wrote:
> > Hello,
> >
> > Thanks for your detailed explanation, and for the pointer to the
> > "Unexplainable slow request" thread.
> >
> > After investigating OSD logs, disk SMART status, etc., the disk
> > under osd.71 seems OK, so we restarted the OSD... And voilà, the
> > problem seems to be solved! (or at least, the "slow request"
> > message disappeared).
> >
> > But this really does not make me happy (and I'm afraid it doesn't
> > make you happy either, Christian). I agree that it is not
> > acceptable that sometimes, apparently randomly, slow requests
> > happen and remain stuck until an operator manually restarts the
> > affected OSD.
> >
> > My question now is: did you file a bug with the Ceph developers?
> > What did they say? Could you provide me the links? I would like
> > to reopen the issue if possible, and see if we can find a
> > solution for this.
> >
> > About our cluster (testing, not production):
> > - ceph version 0.94.1
> > - all hosts running Ubuntu 14.04 LTS 64-bit, kernel 3.16
> > - 5 monitors, 128GB RAM each
> > - 6 OSD hosts, 32GB RAM each, 20 OSDs per host, 1 WD Green 2TB HDD
> >   per OSD
> > - (and 6 more OSD hosts to arrive soon)
> > - 10GbE interconnection
> >
> >
> > Thank you very much indeed.
> > Best regards,
> > - Xavier Serrano
> > - LCAC, Laboratori de Càlcul
> > - Departament d'Arquitectura de Computadors, UPC
> >
> >
> > On Tue May 26 14:19:22 2015, Christian Balzer wrote:
> >
> >>
> >> Hello,
> >>
> >> Firstly, find my "Unexplainable slow request" thread in the ML
> >> archives and read all of it.
> >>
> >> On Tue, 26 May 2015 07:05:36 +0200 Xavier Serrano wrote:
> >>
> >> > Hello,
> >> >
> >> > We have observed that our cluster is often moving back and forth
> >> > from HEALTH_OK to HEALTH_WARN states due to "blocked requests".
> >> > We have also observed "blocked ops". For instance:
> >> >
> >> As always, SW versions and a detailed HW description (down to the
> >> model of HDDs used) will be helpful and educational.
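To put that in more practical terms, this is roughly what I run when
sizing up a suspect OSD and the disk behind it; the OSD ID and the
/dev/sdX device below are of course placeholders for whatever your
environment has, and smartctl comes from the smartmontools package:

# ceph osd tree
# ceph osd perf
# smartctl -a /dev/sdX

"ceph osd perf" lists the commit/apply latencies per OSD, which tends
to make a struggling disk stand out, while smartctl will show things
like reallocated or pending sectors on the drive backing the OSD.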
> >>
> >> > # ceph status
> >> >     cluster 905a1185-b4f0-4664-b881-f0ad2d8be964
> >> >      health HEALTH_WARN
> >> >             1 requests are blocked > 32 sec
> >> >      monmap e5: 5 mons at
> >> > {ceph-host-1=192.168.0.65:6789/0,ceph-host-2=192.168.0.66:6789/0,ceph-host-3=192.168.0.67:6789/0,ceph-host-4=192.168.0.68:6789/0,ceph-host-5=192.168.0.69:6789/0}
> >> >             election epoch 44, quorum 0,1,2,3,4
> >> >             ceph-host-1,ceph-host-2,ceph-host-3,ceph-host-4,ceph-host-5
> >> >      osdmap e5091: 120 osds: 100 up, 100 in
> >> >       pgmap v473436: 2048 pgs, 2 pools, 4373 GB data, 1093 kobjects
> >> >             13164 GB used, 168 TB / 181 TB avail
> >> >                 2048 active+clean
> >> >   client io 10574 kB/s rd, 33883 kB/s wr, 655 op/s
> >> >
> >> > # ceph health detail
> >> > HEALTH_WARN 1 requests are blocked > 32 sec; 1 osds have slow requests
> >> > 1 ops are blocked > 67108.9 sec
> >> > 1 ops are blocked > 67108.9 sec on osd.71
> >> > 1 osds have slow requests
> >> >
> >> You will want to have a very close look at osd.71 (logs, internal
> >> counters, cranking up debugging), but you might find it just as
> >> mysterious as my case in the thread mentioned above.
> >>
> >> >
> >> > My questions are:
> >> > (1) Is it normal to have "slow requests" in a cluster?
> >> Not really. The Ceph developers clearly think these merit only a
> >> WARNING level, whereas I would consider them a clear sign of
> >> brokenness, as VMs or other clients with such requests pending are
> >> likely to be unusable at that point.
> >>
> >> > (2) Or is it a symptom that indicates that something is wrong?
> >> >     (for example, a disk is about to fail)
> >> That. Of course your cluster could also simply be at the edge of
> >> its performance, and then nothing but improving that (most likely
> >> by adding more nodes/OSDs) would fix it.
> >>
> >> > (3) How can we fix the "slow requests"?
> >> Depends on the cause, of course.
> >> AFTER you have exhausted all means and gotten all relevant
> >> log/performance data from osd.71, restarting the OSD might be all
> >> that's needed.
> >>
> >> > (4) What's the meaning of "blocked ops", and how can they be
> >> >     blocked so long? (67000 seconds is more than 18 hours!)
> >> Precisely; this shouldn't happen.
> >>
> >> > (5) How can we fix the "blocked ops"?
> >> >
> >> AFTER you have exhausted all means and gotten all relevant
> >> log/performance data from osd.71, restarting the OSD might be all
> >> that's needed.
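In case it isn't obvious what "all relevant log/performance data"
means in practice, this is more or less what I would capture from
osd.71 before bouncing it; the admin socket commands have to be run
on the node hosting that OSD, and the ID is just an example:

# ceph daemon osd.71 dump_ops_in_flight
# ceph daemon osd.71 dump_historic_ops
# ceph daemon osd.71 perf dump
# ceph tell osd.71 injectargs '--debug-osd 20 --debug-ms 1'

For the actual restart, on Ubuntu 14.04 with the upstart scripts
something like "restart ceph-osd id=71" should do it, or
"service ceph restart osd.71" with the sysvinit scripts; remember to
turn the debug levels back down afterwards.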
> >>
> >> Christian
> >> --
> >> Christian Balzer        Network/Systems Engineer
> >> chibi at gol.com        Global OnLine Japan/Fusion Communications
> >> http://www.gol.com/

--
Christian Balzer        Network/Systems Engineer
chibi at gol.com        Global OnLine Japan/Fusion Communications
http://www.gol.com/