Hi All,
Regarding the monitoring services on a Ceph cluster (i.e. Prometheus,
Grafana, Alertmanager, Loki, Node-Exporter, Promtail, etc.), how many
instances should/can we run for fault-tolerance purposes? I can't seem
to recall that advice being in the doco anywhere (but of course, I
probably missed it).
I'm concerned about HA on those services - will they continue to run if
the Ceph Node they're on fails?
At the moment we're running only 1 instance of each in the cluster, but
several Ceph nodes are capable of running each, e.g. 3 nodes
configured in the placement but only count:1.
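To illustrate the kind of placement I mean, here is a rough sketch of a
cephadm service spec for one of these services. The host names are made
up for the example (not our real ones), and I'm assuming the standard
service spec layout with a host list plus a count:

```yaml
# Hypothetical example: host names are placeholders, not our real nodes.
service_type: prometheus
placement:
  # Only one daemon is deployed at a time...
  count: 1
  # ...but any of these three hosts is eligible to run it.
  hosts:
    - ceph-node1
    - ceph-node2
    - ceph-node3
```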
This is on the latest version of Reef using cephadm (if it makes a
huge difference :-) ).
So any advice, etc, would be greatly appreciated, including if we should
be running any services not mentioned (not Mgr, Mon, OSD, or iSCSI,
obviously :-) )
Cheers
Dulux-Oz
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx