I know, unfortunately, this has been an issue for two or three years
now. The first thing I (and many others) suggest when something stops
working is to fail the mgr. My impression is that over the past few
years more and more features have been added to the mgr while the
default configs haven't changed, causing it to fail silently or at
least misbehave. I created a tracker [0] for one specific issue I saw
on a customer cluster last year. My theory is that, because the
defaults are too low, the mgr's communication with the MONs, OSDs,
MGRs etc. gets flooded and some messages get lost. But I haven't found
a way to reproduce it in test clusters yet, so it's still only a
theory.
[0] https://tracker.ceph.com/issues/66310
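
For reference, failing the mgr can be sketched roughly like this. It is only a minimal sketch, assuming the ceph CLI is on the PATH with an admin keyring; the function name fail_active_mgr and the MGR_FAILOVER_WAIT variable are made up for illustration, and the wait time is arbitrary. Older releases may require the daemon name as an argument to "ceph mgr fail".

```shell
#!/bin/sh
# Sketch only: fail over the active mgr and print the mgr state
# before and after. fail_active_mgr and MGR_FAILOVER_WAIT are
# hypothetical names, not part of Ceph.

fail_active_mgr() {
    # Show which mgr is currently active
    ceph mgr stat

    # Mark the active mgr as failed so a standby takes over.
    # Older releases may need the name: ceph mgr fail <name>
    ceph mgr fail

    # Give a standby a moment to take over (arbitrary default of 5s)
    sleep "${MGR_FAILOVER_WAIT:-5}"

    # Show the mgr state again; active_name should now differ
    ceph mgr stat
}
```

After sourcing the snippet, calling fail_active_mgr on a node with admin access performs the failover. Stopping the active mgr's systemd unit, as described below, has the same effect.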
Quoting Marcus <marcus@xxxxxxxxxx>:
Hi,
Thanks for the tip Eugen!!
I stopped the active systemd mgr so the cluster failed over to
another mgr. After this it all worked fine!
Started the systemd mgr again and it came up as a standby again.
I suppose the mgr had some hiccup; I did not find anything specific
in the log.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx