Hello everyone,

we encountered an error with the Prometheus plugin for Ceph mgr. One OSD was down and (therefore) had no device class:

```
sudo ceph osd tree
ID CLASS WEIGHT  TYPE NAME STATUS REWEIGHT PRI-AFF
28   hdd 7.27539 osd.28        up  1.00000 1.00000
 6             0 osd.6       down        0 1.00000
```

When we tried to curl the metrics, the request failed with a 500 error because the OSD had no class (see "KeyError: 'class'" in the traceback below).

Has anybody experienced the same? Isn't this a bug in the Prometheus plugin? When an OSD is down, the plugin should not stop working, IMO.

```
~> curl -v 127.0.0.1:9283/metrics
*   Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 9283 (#0)
> GET /metrics HTTP/1.1
> Host: 127.0.0.1:9283
> User-Agent: curl/7.47.0
> Accept: */*
>
< HTTP/1.1 500 Internal Server Error
< Date: Wed, 14 Nov 2018 13:59:59 GMT
< Content-Length: 1663
< Content-Type: text/html;charset=utf-8
< Server: CherryPy/3.5.0
<
<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8"></meta>
    <title>500 Internal Server Error</title>
    <style type="text/css">
    #powered_by {
        margin-top: 20px;
        border-top: 2px solid black;
        font-style: italic;
    }

    #traceback {
        color: red;
    }
    </style>
</head>
    <body>
        <h2>500 Internal Server Error</h2>
        <p>The server encountered an unexpected condition which prevented it from fulfilling the request.</p>
        <pre id="traceback">Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/cherrypy/_cprequest.py", line 670, in respond
    response.body = self.handler()
  File "/usr/lib/python2.7/dist-packages/cherrypy/lib/encoding.py", line 217, in __call__
    self.body = self.oldhandler(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/cherrypy/_cpdispatch.py", line 61, in __call__
    return self.callable(*self.args, **self.kwargs)
  File "/usr/lib/x86_64-linux-gnu/ceph/mgr/prometheus/module.py", line 414, in metrics
    metrics = global_instance().collect()
  File "/usr/lib/x86_64-linux-gnu/ceph/mgr/prometheus/module.py", line 351, in collect
    self.get_metadata_and_osd_status()
  File "/usr/lib/x86_64-linux-gnu/ceph/mgr/prometheus/module.py", line 310, in get_metadata_and_osd_status
    dev_class['class'],
KeyError: 'class'
</pre>
    <div id="powered_by">
    <span>
        Powered by <a href="http://www.cherrypy.org">CherryPy 3.5.0</a>
    </span>
    </div>
    </body>
</html>
* Connection #0 to host 127.0.0.1 left intact
```

Kind regards,
Gökhan
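
To illustrate the behaviour I would expect, here is a minimal sketch (not the actual plugin code). The traceback above shows the plain lookup `dev_class['class']` failing. Assuming the per-OSD entry is an ordinary Python dict, a lookup with a fallback would avoid the exception. The helper name `osd_device_class` and the sample data are hypothetical and only mirror the two OSDs from the `ceph osd tree` output.

```python
def osd_device_class(dev_class_entry, default=""):
    """Return the device class of an OSD entry, or `default` if none is set.

    Hypothetical helper: assumes the entry is a plain dict that may lack
    the 'class' key, as the KeyError in the traceback suggests.
    """
    return dev_class_entry.get("class", default)


# Hypothetical sample data modelled on the `ceph osd tree` output:
# osd.28 is up with class "hdd", osd.6 is down and has no class.
osd_entries = {
    28: {"class": "hdd"},
    6: {},
}

labels = {osd_id: osd_device_class(entry) for osd_id, entry in osd_entries.items()}
print(labels)
```

With a fallback like this, the down OSD would get an empty class label instead of aborting the whole metrics request. Whether an empty label or skipping the OSD entirely is the better choice is a design question for the plugin maintainers.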