Re: in retrospect get OSD for "slow requests are blocked" ? / get detailed health status via librados?

Uwe Sauter <uwe.sauter.de@xxxxxxxxx> · Wed, 16 May 2018 13:18:49 +0200

Hi Mohamad,

>> I'm currently chewing on an issue regarding "slow requests are blocked". I'd like to identify the OSD that is causing those events
>> once the cluster is back to HEALTH_OK (as I have no monitoring yet that would get this info in realtime).
>>
>> Collecting this information could help identify aging disks if you were able to accumulate and analyze which OSD had blocking
>> requests in the past and how often those events occur.
>>
>> My research so far leads me to think that this information is only available as long as the requests are actually blocked. Is this
>> correct?
> 
> I think this is what you're looking for:
> 
> $> ceph daemon osd.X dump_historic_slow_ops
> 
> which gives you recent slow operations, as opposed to
> 
> $> ceph daemon osd.X dump_blocked_ops
> 
> which returns current blocked operations. You can also add a filter to
> those commands.

Thanks for these commands, I'll have a look at them. If I understand correctly, "ceph daemon" has to be run on each server for
each of its OSDs rather than from a central location. Is that correct?
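
If so, a rough sketch of what I would run on every OSD host might look like the following. It assumes the admin sockets are in
the default location /var/run/ceph and are named ceph-osd.<id>.asok (cluster name "ceph"). The function names and the SOCK_DIR
variable are made up for illustration, not part of Ceph.

```shell
#!/bin/sh
# Sketch: dump recent slow ops for every OSD that has an admin socket on this host.
# Assumption: sockets are in /var/run/ceph and named ceph-osd.<id>.asok
# (default for a cluster named "ceph"); override SOCK_DIR if yours differ.

SOCK_DIR="${SOCK_DIR:-/var/run/ceph}"

# Print the ID of each OSD admin socket found in directory $1, one per line.
local_osd_ids() {
    for sock in "$1"/ceph-osd.*.asok; do
        [ -e "$sock" ] || continue      # glob matched nothing
        id="${sock##*/ceph-osd.}"       # strip directory and "ceph-osd." prefix
        echo "${id%.asok}"              # strip ".asok" suffix
    done
}

# Run dump_historic_slow_ops against every local OSD and label the output.
dump_local_slow_ops() {
    for id in $(local_osd_ids "$SOCK_DIR"); do
        echo "== osd.$id =="
        ceph daemon "osd.$id" dump_historic_slow_ops
    done
}
```

Calling dump_local_slow_ops on each host (by hand, from cron, or over ssh) and keeping the output would then give a per-OSD
history to compare over time.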

Regards,

	Uwe
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com