hi there,

the dashboard of our moderately used cluster with 3 mon/mgr nodes gets stuck about 30 seconds after a mgr becomes active. the dashboard is then no longer usable (i.e., the mgr daemon stops responding to http requests), although it occasionally comes back from the dead for a few seconds. the same happens to the prometheus module: grafana only shows a few data points here and there. other mgr-related functionality (e.g., ceph pg dump) continues to work just fine.

forcing a switchover to another mgr or enabling / disabling mgr modules helps for a short while, until the whole thing gets stuck again.

a mgr log with debugging enabled for both mgr and mgrc at level 20 can be found at http://www.user.tu-berlin.de/thoralf.schulze/ceph-mgr-2019113.log.xz - in this case, the hang occurred shortly before 14:55.

any hints would be greatly appreciated … thank you very much & with kind regards,
thoralf.