Re: How to get num ops blocked per OSD

Yeah, the removal of that was annoying for sure. If I recall correctly, one can gather the information from the OSDs’ admin sockets.

Envision a Prometheus exporter that polls the admin sockets (in parallel), with Grafana panels that graph slow requests by OSD and by node.
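
A rough sketch of the polling half of that idea, in Python. It assumes the admin socket command is `ceph daemon osd.N dump_blocked_ops` and that its JSON output has an "ops" list whose entries carry an "age" in seconds; check both against your release before relying on it. The function names are made up for illustration, and the script has to run on the node that hosts the OSDs, since admin sockets are local.

```python
import json
import subprocess
from concurrent.futures import ThreadPoolExecutor


def dump_blocked_ops(osd_id):
    """Ask one local OSD's admin socket for its blocked ops.

    Assumption: `ceph daemon osd.N dump_blocked_ops` exists on this release
    and prints JSON.
    """
    result = subprocess.run(
        ["ceph", "daemon", f"osd.{osd_id}", "dump_blocked_ops"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def rank_osds(dumps):
    """Rank OSDs by number of blocked ops, then by the oldest blocked op.

    `dumps` maps OSD id -> parsed JSON. The "ops" and "age" keys are
    assumptions about the output format.
    """
    rows = []
    for osd_id, dump in dumps.items():
        ops = dump.get("ops", [])
        if not ops:
            continue  # nothing blocked on this OSD
        oldest = max(float(op.get("age", 0.0)) for op in ops)
        rows.append((osd_id, len(ops), oldest))
    # Most blocked ops first; ties broken by the longest-blocked op.
    return sorted(rows, key=lambda row: (row[1], row[2]), reverse=True)


def poll_local_osds(osd_ids, workers=8):
    """Poll the given local OSDs in parallel and return the ranking."""
    ids = list(osd_ids)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        dumps = dict(zip(ids, pool.map(dump_blocked_ops, ids)))
    return rank_osds(dumps)
```

Calling `poll_local_osds([0, 1, 2])` on an OSD node would return tuples of (OSD id, blocked op count, oldest age in seconds), worst OSD first, which an exporter could then expose as per-OSD gauges.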


> On Mar 13, 2020, at 4:14 PM, Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
> 
> For Jewel I wrote a script to take the output of `ceph health detail
> --format=json` and send alerts to our system that ordered the osds based on
> how long the ops were blocked and which OSDs had the most ops blocked. This
> was really helpful to quickly identify which OSD out of a list of 100 would
> be the most probable one having issues. Since upgrading to Luminous, I
> don't get that and I'm not sure where that info went to. Do I need to query
> the manager now?
> 
> This is the regex I was using to extract the pertinent information:
> 
> '^(\d+) ops are blocked > (\d+\.+\d+) sec on osd\.(\d+)$'
> 
> Thanks,
> Robert LeBlanc
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx