RFI: Prometheus, Etc, Services - Optimum Number To Run



Hi All,

With regard to the monitoring services on a Ceph cluster (i.e. Prometheus, Grafana, Alertmanager, Loki, Node-Exporter, Promtail, etc.), how many instances of each should/can we run for fault tolerance? I can't recall seeing that advice anywhere in the doco (but of course, I probably missed it).

I'm concerned about HA on those services - will they continue to run if the Ceph Node they're on fails?

At the moment we're running only 1 instance of each in the cluster, but several Ceph Nodes are capable of running each, e.g. 3 nodes are listed in the placement but with only count:1.
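
To illustrate the kind of placement I mean, a cephadm service spec along these lines would give that setup. This is only a sketch, not our actual file: the hostnames are placeholders, and I'm assuming the same pattern for each of the other monitoring services.

```yaml
# Sketch of a cephadm service spec (hostnames are hypothetical placeholders)
service_type: prometheus
placement:
  count: 1          # only one daemon is deployed at a time
  hosts:            # candidate hosts the daemon may be placed on
    - ceph-node-01
    - ceph-node-02
    - ceph-node-03
```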

This is on the latest version of Reef, deployed using cephadm (if it makes a huge difference :-) ).

So any advice, etc., would be greatly appreciated, including whether we should be running any services not mentioned above (other than Mgr, Mon, OSD, or iSCSI, obviously :-) ).


