Hi,
I see the same on different Nautilus clusters and was pointed to this
tracker issue: https://tracker.ceph.com/issues/39264
In one cluster, disabling the prometheus module seemed to stop the
MGRs from failing. But the failures happen so rarely that it might be
something different and we just didn't wait long enough. So it seems to
be a recurring issue. If you use the prometheus mgr module, you could
check whether the problem still occurs with it disabled
(ceph mgr module disable prometheus).
Just two days ago we had the same thing in another cluster where the
prometheus module is disabled, so there it might be something else
just with similar symptoms.
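Since ceph status only warns once all MGRs are gone, a small check on
the MGR map could flag a lost standby earlier. Below is a rough sketch,
not a tested tool: it assumes the JSON from `ceph mgr dump` contains the
fields "available" and "standbys" (a list of entries with a "name"), so
please verify that against your release. The function names and the
expected_standbys parameter are made up for illustration.

```python
import json
import subprocess


def fetch_mgr_dump():
    """Run `ceph mgr dump` and return the parsed JSON.

    Needs a working ceph CLI and keyring on the host running the check.
    """
    out = subprocess.run(
        ["ceph", "mgr", "dump", "--format", "json"],
        check=True, capture_output=True, text=True, timeout=30,
    ).stdout
    return json.loads(out)


def check_mgr_map(mgr_map, expected_standbys):
    """Return a list of problems found in the MGR map (empty list = OK).

    Assumed field names: "available" (bool) and "standbys"
    (list of dicts, each with a "name").
    """
    problems = []
    if not mgr_map.get("available", False):
        problems.append("no active MGR available")
    standbys = [s.get("name", "?") for s in mgr_map.get("standbys", [])]
    if len(standbys) < expected_standbys:
        problems.append(
            "only %d of %d standby MGRs present: %s"
            % (len(standbys), expected_standbys, ", ".join(standbys) or "none")
        )
    return problems
```

Run from cron or a monitoring agent, something like
check_mgr_map(fetch_mgr_dump(), expected_standbys=2) should return an
empty list on a healthy cluster, and you can alert on anything else.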
Regards,
Eugen
Quoting Gilles Mocellin <gilles.mocellin@xxxxxxxxxxxxxx>:
Hi,
In our Ceph Pacific clusters (16.2.10; 1 for OpenStack and S3, 2
for backup on RBD and S3),
since the upgrade to Pacific, the MGR regularly stops
responding and is no longer shown in ceph status.
The process is still there.
Nothing in the MGR log, it just stops logging.
Restarting the service makes it come back.
When all MGRs are down, we get a warning in ceph status, but not before.
I can't find a similar bug in the Tracker.
Does anyone else see this symptom?
Do you have a workaround or solution?
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx