Ceph metric exporter HTTP Error 500

Falk Mueller-Braun <fmuelle4@xxxxxxx> · Fri, 15 Dec 2017 11:53:47 +0100



    Hello,

    since we upgraded to Luminous (12.2.2), we have been using the
    built-in Ceph exporter (the mgr prometheus module) to get the Ceph
    metrics into Prometheus. At random times we get an Internal Server
    Error (HTTP 500) from the exporter, caused by a Python KeyError on
    a seemingly random metric. Often it is one of the "pg_*" metrics.

    
    Here is an example of the Python exception:

    
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/cherrypy/_cprequest.py", line 670, in respond
    response.body = self.handler()
  File "/usr/lib/python2.7/dist-packages/cherrypy/lib/encoding.py", line 217, in __call__
    self.body = self.oldhandler(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/cherrypy/_cpdispatch.py", line 61, in __call__
    return self.callable(*self.args, **self.kwargs)
  File "/usr/lib/ceph/mgr/prometheus/module.py", line 386, in metrics
    metrics = global_instance().collect()
  File "/usr/lib/ceph/mgr/prometheus/module.py", line 324, in collect
    self.get_pg_status()
  File "/usr/lib/ceph/mgr/prometheus/module.py", line 266, in get_pg_status
    self.metrics[path].set(value)
KeyError: 'pg_deep'
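
    For illustration, here is a minimal sketch of how a KeyError like this
    can arise. It is not the actual module.py code: the names Metric,
    KNOWN_PG_STATES, build_metrics, update_strict and update_tolerant are
    hypothetical, and it assumes that the metrics dict is filled from a
    fixed list of PG states while the cluster can report a state ("deep"
    here) that is missing from that list.

```python
# Hypothetical sketch, NOT the real ceph-mgr prometheus module code.


class Metric(object):
    """Stand-in for a metric object that exposes a set() method."""

    def __init__(self, name):
        self.name = name
        self.value = None

    def set(self, value):
        self.value = value


# Assumption: only these PG states are registered up front.
KNOWN_PG_STATES = ["active", "clean", "scrubbing"]


def build_metrics():
    """Create one metric per pre-registered PG state, keyed as 'pg_<state>'."""
    return dict(("pg_" + s, Metric("pg_" + s)) for s in KNOWN_PG_STATES)


def update_strict(metrics, pg_summary):
    """Direct indexing, as in the traceback: raises KeyError on unknown states."""
    for state, count in pg_summary.items():
        metrics["pg_" + state].set(count)


def update_tolerant(metrics, pg_summary):
    """Register a metric the first time a state is seen instead of failing."""
    for state, count in pg_summary.items():
        path = "pg_" + state
        if path not in metrics:
            metrics[path] = Metric(path)
        metrics[path].set(count)
```

    Under that assumption, update_strict fails only while some PG is in
    the unregistered state, which would fit errors that come and go
    without intervention. Whether this is what actually happens in
    12.2.2 would need to be checked against the module source.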

    
    After a while (anywhere from 3-5 minutes to as long as 40 minutes),
    the exporter starts serving metrics again without any intervention.

    Does anyone have an idea what could be done about this, or is anyone
    experiencing similar problems?

    Thanks,

      Falk

    
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com