Re: Prometheus endpoint hanging with 13.2.7 release?

On Mon, Dec 09, 2019 at 05:01:04PM -0800, Paul Choi wrote:
>   Hello,
>   Anybody seeing the Prometheus endpoint hanging with the new 13.2.7
>   release?
>   With 13.2.6 the endpoint would respond with a payload of 15MB in less
>   than 10 seconds.

I'd guess it's not the prometheus module itself:
$ git  diff v13.2.6 v13.2.7 src/pybind/mgr/prometheus
diff --git a/src/pybind/mgr/prometheus/module.py b/src/pybind/mgr/prometheus/module.py
index 2d4472434a..3a398e0b0c 100644
--- a/src/pybind/mgr/prometheus/module.py
+++ b/src/pybind/mgr/prometheus/module.py
@@ -142,7 +142,8 @@ class Metric(object):

          def promethize(path):
              ''' replace illegal metric name characters '''
-            result = path.replace('.', '_').replace('+', '_plus').replace('::', '_')
+            result = path.replace('.', '_').replace(
+                '+', '_plus').replace('::', '_').replace(' ', '_')

              # Hyphens usually turn into underscores, unless they are
              # trailing
@@ -720,7 +721,8 @@ class Module(MgrModule):
                      raise cherrypy.HTTPError(503, 'No MON connection')

          # Make the cache timeout for collecting configurable
-        self.collect_timeout = self.get_localized_config('scrape_interval', 5.0)
+        self.collect_timeout = float(self.get_localized_config(
+            'scrape_interval', 5.0))

          server_addr = self.get_localized_config('server_addr', DEFAULT_ADDR)
          server_port = self.get_localized_config('server_port', DEFAULT_PORT)

So the mgr would be a likely suspect. If you could open a tracker ticket, 
ideally with mgr debug logs attached, this can be looked at.
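
For reference, the two changes in the diff above amount to roughly the following. This is only a sketch of the lines visible in the diff, not the module itself: get_localized_config_stub is a hypothetical stand-in, and the assumption that an operator-set value comes back as a string is mine.

```python
def promethize(path):
    """Replace characters that are not legal in a Prometheus metric name.

    Covers only the replace() chain visible in the diff; the hyphen
    handling that follows it in module.py is not reproduced here.
    """
    return path.replace('.', '_').replace(
        '+', '_plus').replace('::', '_').replace(' ', '_')


def get_localized_config_stub(key, default):
    """Hypothetical stand-in for MgrModule.get_localized_config().

    Assumption: a value set by the operator comes back as a string,
    while an unset key returns the default unchanged.
    """
    stored = {'scrape_interval': '15'}
    return stored.get(key, default)


# 13.2.7 wraps the lookup in float(), so the timeout is numeric
# whether it came from a stored string or from the 5.0 default.
collect_timeout = float(get_localized_config_stub('scrape_interval', 5.0))
default_timeout = float(get_localized_config_stub('missing_key', 5.0))
```

Neither change touches how metrics are collected or served, which is why the module diff alone does not look like the cause.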

>   Now, if you restart ceph-mgr, the Prometheus endpoint responds quickly
>   for the first run, then successive runs get slower and slower, until it
>   takes several minutes.
>   I have no customization for the mgr module. Except for the Prometheus
>   module, the "status" module and "Zabbix" module are working fine.
>   This is on Ubuntu 16 LTS:
>   ii  ceph-mgr  13.2.7-1xenial  amd64  manager for the ceph distributed storage system
>   I'd love to know if there's a way to diagnose this issue - I tried
>   upping the debug ms level but that doesn't seem to yield useful
>   information.
>   I don't know if this is useful, but "prometheus self-test" is fine too.
>   $ sudo ceph tell mgr.0 prometheus self-test
>   Self-test OK
>   pchoi@pchoi-desktop:~$ ceph mgr module ls
>   {
>       "enabled_modules": [
>           "prometheus",
>           "status",
>           "zabbix"
>       ],
>       "disabled_modules": [
>           {
>               "name": "balancer",
>               "can_run": true,
>               "error_string": ""
>           },
>           {
>               "name": "crash",
>               "can_run": true,
>               "error_string": ""
>           },
>           {
>               "name": "dashboard",
>               "can_run": true,
>               "error_string": ""
>           },
>           {
>               "name": "hello",
>               "can_run": true,
>               "error_string": ""
>           },
>           {
>               "name": "influx",
>               "can_run": false,
>               "error_string": "influxdb python module not found"
>           },
>           {
>               "name": "iostat",
>               "can_run": true,
>               "error_string": ""
>           },
>           {
>               "name": "localpool",
>               "can_run": true,
>               "error_string": ""
>           },
>           {
>               "name": "restful",
>               "can_run": true,
>               "error_string": ""
>           },
>           {
>               "name": "selftest",
>               "can_run": true,
>               "error_string": ""
>           },
>           {
>               "name": "smart",
>               "can_run": true,
>               "error_string": ""
>           },
>           {
>               "name": "telegraf",
>               "can_run": true,
>               "error_string": ""
>           },
>           {
>               "name": "telemetry",
>               "can_run": true,
>               "error_string": ""
>           }
>       ]
>   }
>   pchoi@pchoi-desktop:~$ ceph mgr services
>   {
>       "prometheus": "http://woodenbox2:9283/"
>   }

>_______________________________________________
>ceph-users mailing list -- ceph-users@xxxxxxx
>To unsubscribe send an email to ceph-users-leave@xxxxxxx
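
To put numbers on the "successive runs get slower and slower" behaviour for a tracker ticket, something like the following could be run right after restarting ceph-mgr. It is a minimal sketch: time_scrapes and fetch_metrics are names made up here, the /metrics path is assumed to be where the module serves its payload, and the host and port are taken from the "ceph mgr services" output quoted above.

```python
import time
import urllib.request


def time_scrapes(fetch, runs=5, pause=0.0):
    """Call fetch() `runs` times and return (elapsed_seconds, payload_bytes) per call."""
    results = []
    for _ in range(runs):
        start = time.monotonic()
        body = fetch()
        results.append((time.monotonic() - start, len(body)))
        if pause:
            time.sleep(pause)
    return results


def fetch_metrics(url="http://woodenbox2:9283/metrics", timeout=600):
    """Fetch one scrape; URL path and timeout are assumptions, adjust as needed."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()
```

Calling time_scrapes(fetch_metrics, runs=10, pause=1.0) and printing each pair gives a per-run duration and payload size that can be attached next to the mgr debug logs.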


-- 
Jan Fajerski
Senior Software Engineer Enterprise Storage
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer



