Yeah, the removal of that was annoying for sure.

ISTR that one can gather the information from the OSDs’ admin sockets. Envision a Prometheus exporter that polls the admin sockets (in parallel) and Grafana panels that graph slow requests by OSD and by node.

> On Mar 13, 2020, at 4:14 PM, Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
>
> For Jewel I wrote a script to take the output of `ceph health detail
> --format=json` and send alerts to our system that ordered the osds based on
> how long the ops were blocked and which OSDs had the most ops blocked. This
> was really helpful to quickly identify which OSD out of a list of 100 would
> be the most probable one having issues. Since upgrading to Luminous, I
> don't get that and I'm not sure where that info went to. Do I need to query
> the manager now?
>
> This is the regex I was using to extract the pertinent information:
>
> '^(\d+) ops are blocked > (\d+\.+\d+) sec on osd\.(\d+)$'
>
> Thanks,
> Robert LeBlanc
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
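
For what it's worth, here is a minimal sketch of the ranking step described in the quoted message: order OSDs by how long their ops have been blocked, then by how many ops are blocked. It assumes you still have (or can synthesize) lines in the Jewel-era "N ops are blocked > S sec on osd.X" form. On Luminous the counts would have to come from somewhere else, such as the admin sockets, whose output format is not shown here. The function name `rank_blocked_osds` and the sample lines are made up for illustration. The pattern is the one quoted above, except that `\.+` is tightened to `\.` so that `float()` never sees repeated dots.

```python
import re
from collections import defaultdict

# Pattern from the quoted message, with `\.+` tightened to `\.` (an assumption
# about the intent; the original would also accept "32..768").
BLOCKED_RE = re.compile(r'^(\d+) ops are blocked > (\d+\.\d+) sec on osd\.(\d+)$')


def rank_blocked_osds(lines):
    """Return [(osd_id, longest_block_sec, total_blocked_ops)], worst first.

    OSDs are ordered by the longest blocked-op threshold seen, with ties
    broken by the total number of blocked ops on that OSD.
    """
    longest = defaultdict(float)
    total = defaultdict(int)
    for line in lines:
        match = BLOCKED_RE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a per-OSD blocked-ops line
        count = int(match.group(1))
        seconds = float(match.group(2))
        osd = int(match.group(3))
        longest[osd] = max(longest[osd], seconds)
        total[osd] += count
    return sorted(
        ((osd, longest[osd], total[osd]) for osd in total),
        key=lambda row: (row[1], row[2]),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical sample input, not taken from a real cluster.
    sample = [
        "3 ops are blocked > 32.768 sec on osd.5",
        "10 ops are blocked > 65.536 sec on osd.12",
        "2 ops are blocked > 131.072 sec on osd.5",
    ]
    for osd, seconds, ops in rank_blocked_osds(sample):
        print(f"osd.{osd}: {ops} ops blocked, longest > {seconds} sec")
```

The same ranking would apply unchanged if the (count, seconds, osd) tuples came from an exporter polling each OSD rather than from parsed health output. Only the parsing step would need to be swapped out.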