Re: How to ... alertmanager and prometheus

Eugen Block <eblock@xxxxxx> · Tue, 08 Nov 2022 14:56:54 +0000

Hi,

the only information I have found so far is this statement from the
Red Hat docs [1]:

When multiple services of the same type are deployed, a  
highly-available setup is deployed.

I tried to do that in a virtual test environment (16.2.7) and it seems  
to work as expected.

ses7-host1:~ # ceph orch ps --daemon_type prometheus
NAME                   HOST        PORTS   STATUS           REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
prometheus.ses7-host1  ses7-host1          running (6h)     12s ago    12M     165M        -  2.18.0   8eb9f2694232  04a0b33e2474
prometheus.ses7-host2  ses7-host2  *:9095  host is offline  89s ago     6h     236M        -           8eb9f2694232  0cb070cea4eb

host2 was the active mgr before I shut it down, but I still have
access to the prometheus metrics as well as the active alerts from
alertmanager. There's also one spare instance running; the same
applies to grafana:

ses7-host1:~ # ceph orch ps --daemon_type alertmanager
NAME                     HOST        PORTS        STATUS          REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
alertmanager.ses7-host1  ses7-host1               running (6h)    42s ago    12M    33.7M        -  0.16.2   903e9b49157e  5a4ffc9a79da
alertmanager.ses7-host2  ses7-host2  *:9093,9094  running (102s)  44s ago     6h    35.5M        -           903e9b49157e  71ac3c636a6b

ses7-host1:~ # ceph orch ps --daemon_type prometheus
NAME                   HOST        PORTS   STATUS          REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
prometheus.ses7-host1  ses7-host1          running (6h)    44s ago    12M     156M        -  2.18.0   8eb9f2694232  04a0b33e2474
prometheus.ses7-host2  ses7-host2  *:9095  running (104s)  47s ago     6h     250M        -           8eb9f2694232  87a5a8349f05

ses7-host1:~ # ceph orch ps --daemon_type grafana
NAME                HOST        PORTS   STATUS          REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
grafana.ses7-host1  ses7-host1          running (6h)    47s ago    12M    99.6M        -  7.1.5    31b52dc794e2  7935ecf47b38
grafana.ses7-host2  ses7-host2  *:3000  running (107s)  49s ago     6h     108M        -  7.1.5    31b52dc794e2  17dea034bb33

I just specified two hosts in the placement section of each service
and deployed them. I think this should be mentioned in the Ceph docs
(not only in the Red Hat ones).
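
As a rough illustration of the placement approach described above, a
spec along the following lines could be used. This is only a sketch,
not the exact spec from the test cluster: the file name
monitoring.yaml is made up, and the host names are simply the two
hosts from the output above.

```shell
# Write a multi-document service spec (hypothetical file name) that
# places each monitoring service on two hosts.
cat > monitoring.yaml <<'EOF'
service_type: prometheus
placement:
  hosts:
    - ses7-host1
    - ses7-host2
---
service_type: alertmanager
placement:
  hosts:
    - ses7-host1
    - ses7-host2
---
service_type: grafana
placement:
  hosts:
    - ses7-host1
    - ses7-host2
EOF

# Apply the spec from a node with an admin keyring.
# The guard only skips this step where the ceph CLI is not installed.
if command -v ceph >/dev/null 2>&1; then
    ceph orch apply -i monitoring.yaml
fi
```

Afterwards, "ceph orch ps --daemon_type prometheus" (and likewise for
the other two services) should list one daemon per host.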

[1]  
https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/5/html/operations_guide/management-of-monitoring-stack-using-the-ceph-orchestrator

Quoting Michael Lipp <mnl@xxxxxx>:

Hi,

I've just set up a test cluster with cephadm using quincy. Things
work nicely. However, I'm not sure how to "handle" alertmanager and
prometheus.

Both services obviously aren't crucial to the working of the
storage, fine. But there seems to be no built-in failover concept.

By default, the active mgr accesses the services using
host.containers.local, thus assuming that they run on the same
machine as the active manager. This assumption is true after the
initial installation. Turning off the host with the active manager
activates the stand-by on another machine, but alertmanager and
prometheus are gone (i.e. not "moved along"). So the active manager
produces lots of error messages when logging into it. Turning the
turned-off machine on again doesn't help, because alertmanager and
prometheus are back, but on the wrong machine.

I couldn't find anything in the documentation. Are alertmanager and  
prometheus supposed to run in some HA-VM? Then I could add the HA-VM  
to the cluster with (only) these two services running on it and make  
the URIs point to this HA-VM (ceph dashboard
set-alertmanager-api-host ..., ceph dashboard set-grafana-api-url
..., ceph dashboard set-prometheus-api-host ...).

How is this supposed to be configured?

 - Michael

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
