Blocked requests/ops?

robert@xxxxxxxxxxxxx (Robert LeBlanc) · Tue, 26 May 2015 10:00:13 -0600



I've seen I/O become stuck after we have done network torture tests.
It seems that after a certain number of retries the OSD peering just
gives up and doesn't retry any more. An OSD restart kicks off another
round of retries and the I/O completes. I believe there was some
discussion about this on the devel list recently.
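
For anyone who wants to spot the affected OSDs before restarting them,
here is a rough Python sketch that pulls the per-OSD "ops are blocked"
lines out of `ceph health detail` text. It assumes the line format shown
in the 0.94.1 output quoted below; other releases may word it differently.
The function name `blocked_osds` is made up for illustration and is not
part of any Ceph tooling.

```python
import re

# Matches per-OSD lines such as "1 ops are blocked > 67108.9 sec on osd.71".
# Assumption: this wording is taken from the 0.94.1 output in this thread.
BLOCKED_RE = re.compile(r"(\d+) ops are blocked > ([\d.]+) sec on (osd\.\d+)")


def blocked_osds(health_detail):
    """Return {osd_name: (total_blocked_ops, longest_threshold_seconds)}."""
    result = {}
    for line in health_detail.splitlines():
        match = BLOCKED_RE.search(line)
        if match is None:
            continue  # summary lines without "on osd.N" are skipped
        count = int(match.group(1))
        seconds = float(match.group(2))
        osd = match.group(3)
        prev_count, prev_seconds = result.get(osd, (0, 0.0))
        # Sum the op counts and keep the largest age threshold seen per OSD.
        result[osd] = (prev_count + count, max(prev_seconds, seconds))
    return result
```

Feed it the captured text of `ceph health detail`; it only reads the
text and does not talk to the cluster itself.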
- ----------------
Robert LeBlanc
GPG Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Tue, May 26, 2015 at 4:06 AM, Xavier Serrano wrote:
> Hello,
>
> Thanks for your detailed explanation, and for the pointer to the
> "Unexplainable slow request" thread.
>
> After investigating osd logs, disk SMART status, etc., the disk under
> osd.71 seems OK, so we restarted the osd... And voilà, the problem seems
> to be solved! (or at least, the "slow request" message disappeared).
>
> But this really does not make me happy (nor you, Christian, I'm
> afraid). It is not acceptable that slow requests sometimes happen,
> apparently at random, and remain stuck until an operator manually
> restarts the affected osd.
>
> My question now is: did you file a bug with the ceph developers?
> What did they say? Could you provide me the links? I would like
> to reopen the issue if possible, and see if we can find a
> solution for this.
>
> About our cluster (testing, not production):
>  - ceph version 0.94.1
>  - all hosts running Ubuntu 14.04 LTS 64-bit, kernel 3.16
>  - 5 monitors, 128GB RAM each
>  - 6 osd hosts, 32GB RAM each, 20 osds per host, 1 HDD WD Green 2TB per osd
>  - (and 6 more osd hosts to arrive soon)
>  - 10 GbE interconnection
>
>
> Thank you very much indeed.
> Best regards,
> - Xavier Serrano
> - LCAC, Laboratori de Càlcul
> - Departament d'Arquitectura de Computadors, UPC
>
>
> On Tue May 26 14:19:22 2015, Christian Balzer wrote:
>
>>
>> Hello,
>>
>> Firstly, find my "Unexplainable slow request" thread in the ML archives
>> and read all of it.
>>
>> On Tue, 26 May 2015 07:05:36 +0200 Xavier Serrano wrote:
>>
>> > Hello,
>> >
>> > We have observed that our cluster is often moving back and forth
>> > from HEALTH_OK to HEALTH_WARN states due to "blocked requests".
>> > We have also observed "blocked ops". For instance:
>> >
>> As always SW versions and a detailed HW description (down to the model of
>> HDDs used) will be helpful and educational.
>>
>> > # ceph status
>> >     cluster 905a1185-b4f0-4664-b881-f0ad2d8be964
>> >      health HEALTH_WARN
>> >             1 requests are blocked > 32 sec
>> >      monmap e5: 5 mons at {ceph-host-1=192.168.0.65:6789/0,ceph-host-2=192.168.0.66:6789/0,ceph-host-3=192.168.0.67:6789/0,ceph-host-4=192.168.0.68:6789/0,ceph-host-5=192.168.0.69:6789/0}
>> >             election epoch 44, quorum 0,1,2,3,4 ceph-host-1,ceph-host-2,ceph-host-3,ceph-host-4,ceph-host-5
>> >      osdmap e5091: 120 osds: 100 up, 100 in
>> >       pgmap v473436: 2048 pgs, 2 pools, 4373 GB data, 1093 kobjects
>> >             13164 GB used, 168 TB / 181 TB avail
>> >                 2048 active+clean
>> >   client io 10574 kB/s rd, 33883 kB/s wr, 655 op/s
>> >
>> > # ceph health detail
>> > HEALTH_WARN 1 requests are blocked > 32 sec; 1 osds have slow requests
>> > 1 ops are blocked > 67108.9 sec
>> > 1 ops are blocked > 67108.9 sec on osd.71
>> > 1 osds have slow requests
>> >
>> You will want to have a very close look at osd.71 (logs, internal
>> counters, cranking up debugging), but might find it just as mysterious as
>> my case in the thread mentioned above.
>>
>> >
>> > My questions are:
>> > (1) Is it normal to have "slow requests" in a cluster?
>> Not really, though the Ceph developers clearly think those merit just a
>> WARNING level, whereas I would consider those a clear sign of brokenness,
>> as VMs or other clients with those requests pending are likely to be
>> unusable at that point.
>>
>> > (2) Or is it a symptom that indicates that something is wrong?
>> >     (for example, a disk is about to fail)
>> That. Of course your cluster could just be at the edge of its performance,
>> in which case nothing but improving that (most likely by adding more
>> nodes/OSDs) would fix it.
>>
>> > (3) How can we fix the "slow requests"?
>> Depends on the cause, of course.
>> AFTER you have exhausted all means and gotten all relevant log/performance
>> data from osd.71, restarting the osd might be all that's needed.
>>
>> > (4) What's the meaning of "blocked ops", and how can they be
>> >     blocked so long? (67000 seconds is more than 18 hours!)
>> Precisely, this shouldn't happen.
>>
>> > (5) How can we fix the "blocked ops"?
>> >
>> AFTER you have exhausted all means and gotten all relevant log/performance
>> data from osd.71, restarting the osd might be all that's needed.
>>
>> Christian
>> --
>> Christian Balzer        Network/Systems Engineer
>> chibi at gol.com         Global OnLine Japan/Fusion Communications
>> http://www.gol.com/
>