At the moment I've found that the mgr daemon works fine when I move it to an OSD node. All nodes have the same OS version, so I can conclude that the problem is limited to the nodes that normally run mgr. I'm still investigating what's happening, but at least I got the monitoring back.
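For anyone who wants to compare the mgr on different nodes, here is a minimal sketch of a probe. It is not something from the original troubleshooting; the function name `probe_metrics` is hypothetical, and it assumes the module answers on the `/metrics` path at the configured port (9283 in the quoted message).

```python
import urllib.error
import urllib.request

# Ignore any http_proxy environment settings: the mgr endpoints are
# assumed to be reachable directly inside the cluster network.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe_metrics(url, timeout=5.0):
    """Fetch a mgr prometheus URL and return (status, first_body_line).

    status is the HTTP status code, or None if no HTTP response arrived.
    """
    try:
        with _OPENER.open(url, timeout=timeout) as resp:
            status, raw = resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # Non-2xx answers (such as the 503 in the quoted message)
        # are raised as HTTPError; the body is still readable.
        status, raw = err.code, err.read()
    except OSError as err:
        # URLError subclasses OSError: refused, timed out, DNS failure.
        return None, str(err)
    lines = raw.decode("utf-8", errors="replace").splitlines()
    return status, (lines[0] if lines else "")
```

Calling `probe_metrics("http://ceph-hn01:9283/metrics")` against the original mgr node and then against the OSD node now running the mgr should show whether only the former answers with 503.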
Regards.
On Tue, Jun 4, 2024 at 4:01 PM Dario Graña <dgrana@xxxxxx> wrote:
Hi all!

I'm running ceph quincy 17.2.7 in a cluster. On Monday I updated the OS from AlmaLinux 9.3 to 9.4; since then grafana shows a "No Data" message in all ceph-related fields but, for example, the nodes information is still fine (Host Detail Dashboard).

I have redeployed the mgr service with cephadm, and disabled and re-enabled the mgr prometheus module, but nothing changed. Digging into the problem, I accessed the prometheus interface and found an error. When I access the node shown as down, it reports
503 Service Unavailable
No cached data available yet
Traceback (most recent call last):
  File "/lib/python3.6/site-packages/cherrypy/_cprequest.py", line 638, in respond
    self._do_respond(path_info)
  File "/lib/python3.6/site-packages/cherrypy/_cprequest.py", line 697, in _do_respond
    response.body = self.handler()
  File "/lib/python3.6/site-packages/cherrypy/lib/encoding.py", line 219, in __call__
    self.body = self.oldhandler(*args, **kwargs)
  File "/lib/python3.6/site-packages/cherrypy/_cpdispatch.py", line 54, in __call__
    return self.callable(*self.args, **self.kwargs)
  File "/usr/share/ceph/mgr/prometheus/module.py", line 1751, in metrics
    return self._metrics(_global_instance)
  File "/usr/share/ceph/mgr/prometheus/module.py", line 1762, in _metrics
    raise cherrypy.HTTPError(503, 'No cached data available yet')
cherrypy._cperror.HTTPError: (503, 'No cached data available yet')

I checked the mgr prometheus address and port

[ceph: root@ceph-admin01 /]# ceph config get mgr mgr/prometheus/server_addr
::
[ceph: root@ceph-admin01 /]# ceph config get mgr mgr/prometheus/server_port
9283

It seems to be ok.

When I checked the master manager node for the port, I found

[root@ceph-hn01 ~]# netstat -natup | grep 9283
tcp6 0 0 :::9283 :::* LISTEN 2453/ceph-mgr
tcp6 0 0 192.168.97.51:9283 192.168.97.60:36130 ESTABLISHED 2453/ceph-mgr

I don't understand why it is showing as IPv6; the node doesn't have a dual stack.

I also tried a newer version of the prometheus container image, the 1.6.0, but it kept reporting the same error, so I rolled it back to the original one.

Has anyone experienced an issue like this?
Where can I look for more information about it?

Thanks in advance.
Regards.

--
Dario Graña
PIC (Port d'Informació Científica)
Campus UAB, Edificio D
E-08193 Bellaterra, Barcelona
http://www.pic.es
Avis - Aviso - Legal Notice: http://legal.ifae.es