Re: How to ... alertmanager and prometheus

Hi,

The only information I've found so far is this statement from the Red Hat docs [1]:

When multiple services of the same type are deployed, a highly-available setup is deployed.

I tried to do that in a virtual test environment (16.2.7) and it seems to work as expected.
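
For example, something like this should give you two daemons of each service (just a sketch; the host names are from my test setup, adjust them to yours):

# "2 ses7-host1 ses7-host2" = daemon count plus the hosts to pin them to
ceph orch apply prometheus --placement="2 ses7-host1 ses7-host2"
ceph orch apply alertmanager --placement="2 ses7-host1 ses7-host2"
ceph orch apply grafana --placement="2 ses7-host1 ses7-host2"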

ses7-host1:~ # ceph orch ps --daemon_type prometheus
NAME                   HOST        PORTS   STATUS           REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
prometheus.ses7-host1  ses7-host1          running (6h)     12s ago    12M  165M     -        2.18.0   8eb9f2694232  04a0b33e2474
prometheus.ses7-host2  ses7-host2  *:9095  host is offline  89s ago    6h   236M     -                 8eb9f2694232  0cb070cea4eb

host2 was the active mgr before I shut it down, but I still have access to Prometheus metrics as well as to active alerts from Alertmanager; there's also a spare instance of each service running. The same applies to Grafana:

ses7-host1:~ # ceph orch ps --daemon_type alertmanager
NAME                     HOST        PORTS        STATUS          REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
alertmanager.ses7-host1  ses7-host1               running (6h)    42s ago    12M  33.7M    -        0.16.2   903e9b49157e  5a4ffc9a79da
alertmanager.ses7-host2  ses7-host2  *:9093,9094  running (102s)  44s ago    6h   35.5M    -                 903e9b49157e  71ac3c636a6b

ses7-host1:~ # ceph orch ps --daemon_type prometheus
NAME                   HOST        PORTS   STATUS          REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
prometheus.ses7-host1  ses7-host1          running (6h)    44s ago    12M  156M     -        2.18.0   8eb9f2694232  04a0b33e2474
prometheus.ses7-host2  ses7-host2  *:9095  running (104s)  47s ago    6h   250M     -                 8eb9f2694232  87a5a8349f05

ses7-host1:~ # ceph orch ps --daemon_type grafana
NAME                HOST        PORTS   STATUS          REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
grafana.ses7-host1  ses7-host1          running (6h)    47s ago    12M  99.6M    -        7.1.5    31b52dc794e2  7935ecf47b38
grafana.ses7-host2  ses7-host2  *:3000  running (107s)  49s ago    6h   108M     -        7.1.5    31b52dc794e2  17dea034bb33

I just specified two hosts in the placement section of each service and deployed them. I think this should be mentioned in the Ceph docs, too, not only in the Red Hat ones.
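
In case it helps, the placement section I mean looks roughly like this (a sketch; host names are from my test setup and the file name is arbitrary):

service_type: prometheus
placement:
  hosts:
    - ses7-host1   # my test hosts, substitute your own
    - ses7-host2

applied with "ceph orch apply -i prometheus.yaml"; analogous specs work for alertmanager and grafana.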

[1] https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/5/html/operations_guide/management-of-monitoring-stack-using-the-ceph-orchestrator

Quoting Michael Lipp <mnl@xxxxxx>:

Hi,

I've just set up a test cluster with cephadm using Quincy. Things work nicely. However, I'm not sure how to "handle" alertmanager and prometheus.

Both services obviously aren't crucial to the operation of the storage, fine. But there seems to be no built-in failover concept.

By default, the active mgr accesses the services using host.containers.local, thus assuming that they run on the same machine as the active manager. This assumption holds after the initial installation. Turning off the host with the active manager activates the standby on another machine, but alertmanager and prometheus are gone (i.e. not "moved along"), so the active manager produces lots of error messages when you log into it. Turning the switched-off machine on again doesn't help, because alertmanager and prometheus are then back, but on the wrong machine.

I couldn't find anything in the documentation. Are alertmanager and prometheus supposed to run in some HA-VM? Then I could add the HA-VM to the cluster with (only) these two services running on it and make the URIs point to this HA-VM (ceph dashboard set-alertmanager-api-host ..., ceph dashboard set-grafana-api-url ..., ceph dashboard set-prometheus-api-host ...).
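
For example (just a sketch, assuming cephadm's default ports; "ha-vm" is a placeholder for that machine's name):

# ha-vm = hypothetical HA-VM name; 9093/9095/3000 are cephadm's default ports
ceph dashboard set-alertmanager-api-host 'http://ha-vm:9093'
ceph dashboard set-prometheus-api-host 'http://ha-vm:9095'
ceph dashboard set-grafana-api-url 'https://ha-vm:3000'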

How is this supposed to be configured?

 - Michael





_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



