Blocked requests/ops?

Hello,

Thanks for your detailed explanation, and for the pointer to the
"Unexplainable slow request" thread.

After investigating OSD logs, disk SMART status, etc., the disk under
osd.71 seems fine, so we restarted the OSD... and voilà, the problem
seems to be solved! (Or at least, the "slow request" message has
disappeared.)
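
For reference, this is roughly what we ran. The device name below is a
placeholder, and the restart uses the upstart job on our Ubuntu 14.04
hosts, so adjust to your setup:

# smartctl -a /dev/sdX                                 (SMART health of the disk behind osd.71)
# grep -i 'slow request' /var/log/ceph/ceph-osd.71.log | tail
# restart ceph-osd id=71                               (restart only that OSD)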

But this really does not make me happy (and it does not make you happy
either, Christian, I'm afraid). In my view, it is not acceptable that
slow requests sometimes occur, apparently at random, and remain stuck
until an operator manually restarts the affected OSD.

My question now is: did you file a bug with the Ceph developers?
What did they say? Could you give me the links? I would like to
reopen the issue if possible, and see if we can find a solution
for this.

About our cluster (testing, not production):
 - ceph version 0.94.1
 - all hosts running Ubuntu 14.04 LTS 64-bit, kernel 3.16
 - 5 monitors, 128GB RAM each
 - 6 OSD hosts, 32GB RAM each, 20 OSDs per host, one 2TB WD Green HDD per OSD
 - (and 6 more OSD hosts to arrive soon)
 - 10 GbE interconnection


Thank you very much indeed.
Best regards,
- Xavier Serrano
- LCAC, Laboratori de Càlcul
- Departament d'Arquitectura de Computadors, UPC


On Tue May 26 14:19:22 2015, Christian Balzer wrote:

> 
> Hello,
> 
> Firstly, find my "Unexplainable slow request" thread in the ML archives
> and read all of it.
> 
> On Tue, 26 May 2015 07:05:36 +0200 Xavier Serrano wrote:
> 
> > Hello,
> > 
> > We have observed that our cluster is often moving back and forth
> > from HEALTH_OK to HEALTH_WARN states due to "blocked requests".
> > We have also observed "blocked ops". For instance:
> > 
> As always, SW versions and a detailed HW description (down to the model
> of the HDDs used) will be helpful and educational.
> 
> > # ceph status
> >     cluster 905a1185-b4f0-4664-b881-f0ad2d8be964
> >      health HEALTH_WARN
> >             1 requests are blocked > 32 sec
> >      monmap e5: 5 mons at {ceph-host-1=192.168.0.65:6789/0,ceph-host-2=192.168.0.66:6789/0,ceph-host-3=192.168.0.67:6789/0,ceph-host-4=192.168.0.68:6789/0,ceph-host-5=192.168.0.69:6789/0}
> >             election epoch 44, quorum 0,1,2,3,4 ceph-host-1,ceph-host-2,ceph-host-3,ceph-host-4,ceph-host-5
> >      osdmap e5091: 120 osds: 100 up, 100 in
> >       pgmap v473436: 2048 pgs, 2 pools, 4373 GB data, 1093 kobjects
> >             13164 GB used, 168 TB / 181 TB avail
> >                 2048 active+clean
> >   client io 10574 kB/s rd, 33883 kB/s wr, 655 op/s
> > 
> > # ceph health detail
> > HEALTH_WARN 1 requests are blocked > 32 sec; 1 osds have slow requests
> > 1 ops are blocked > 67108.9 sec
> > 1 ops are blocked > 67108.9 sec on osd.71
> > 1 osds have slow requests
> > 
> You will want to have a very close look at osd.71 (logs, internal
> counters, cranking up debugging), but might find it just as mysterious as
> my case in the thread mentioned above.
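
(For the record, we read osd.71's internal counters through the admin
socket on its host, along these lines; the debug levels below are our
own choice, not prescribed values:)

# ceph daemon osd.71 dump_ops_in_flight      (ops currently stuck in the OSD)
# ceph daemon osd.71 dump_historic_ops       (recently completed ops and their latencies)
# ceph osd perf                              (commit/apply latency per OSD)
# ceph tell osd.71 injectargs '--debug_osd 20 --debug_ms 1'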
> 
> > 
> > My questions are:
> > (1) Is it normal to have "slow requests" in a cluster?
> Not really, though the Ceph developers clearly think these merit just a
> WARNING level, whereas I would consider them a clear sign of brokenness,
> as VMs or other clients with those requests pending are likely to be
> unusable at that point.
> 
> > (2) Or is it a symptom that indicates that something is wrong?
> >     (for example, a disk is about to fail)
> That. Of course, your cluster could simply be at the edge of its
> performance, in which case nothing but improving it (most likely by
> adding more nodes/OSDs) would fix that.
> 
> > (3) How can we fix the "slow requests"?
> Depends on the cause, of course.
> AFTER you have exhausted all means and gotten all relevant
> log/performance data from osd.71, restarting the OSD might be all
> that's needed.
> 
> > (4) What's the meaning of "blocked ops", and how can they be
> >     blocked so long? (67000 seconds is more than 18 hours!)
> Precisely, this shouldn't happen.
> 
> > (5) How can we fix the "blocked ops"?
> > 
> AFTER you have exhausted all means and gotten all relevant
> log/performance data from osd.71, restarting the OSD might be all
> that's needed.
> 
> Christian
> -- 
> Christian Balzer        Network/Systems Engineer                
> chibi at gol.com   	Global OnLine Japan/Fusion Communications
> http://www.gol.com/
