Re: prometheus - figure out which mgr (metrics endpoint) that is active

David Orman <ormandj@xxxxxxxxxxxx> · Tue, 28 Sep 2021 15:38:06 -0500

We scrape all mgr endpoints as well, since we use external Prometheus
clusters. Only the active mgr returns metrics, so the query results
will have {instance=activemgrhost}. The upstream dashboards don't
support multiple clusters, so we have to modify them to work with our
deployments, where several ceph clusters are polled by the same
Prometheus clusters. In practice we add an instance regular expression
to every query on the dashboards, plus a dashboard variable that holds
the list of clusters. The variable is filled by a label_values call on
one of the ceph_exporter metrics, with a regular expression that
extracts the part after the hostname portion of the fqdn.
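
To make the idea above concrete, here is a rough Python sketch of the
two pieces: pulling the cluster part out of an instance label, and
building the instance matcher that gets added to each dashboard query.
It is only an illustration, not our actual dashboard change. The
instance format "<host>.<cluster-domain>:<port>", the example host
names, and the function names are all assumptions made up for this
example.

```python
import re
from typing import Iterable, List, Optional

# Assumed instance label format: "<host>.<cluster-domain>[:<port>]".
# Everything after the first dot (minus the port) is treated as the cluster.
INSTANCE_RE = re.compile(r"^[^.]+\.(?P<cluster>[^:]+)(?::\d+)?$")


def cluster_of(instance: str) -> Optional[str]:
    """Return the part after the hostname portion of the fqdn, or None."""
    match = INSTANCE_RE.match(instance)
    return match.group("cluster") if match else None


def clusters(instances: Iterable[str]) -> List[str]:
    """Sorted, de-duplicated cluster list, like a dashboard variable would hold."""
    return sorted({c for c in map(cluster_of, instances) if c})


def instance_matcher(cluster: str) -> str:
    """Build a PromQL label matcher selecting any host in the given cluster.

    Literal dots are written as [.] to avoid backslash escaping inside the
    PromQL string. Prometheus regex matchers are fully anchored, so no
    explicit ^ or $ is needed.
    """
    return 'instance=~"[^.]+[.]{}(:[0-9]+)?"'.format(cluster.replace(".", "[.]"))


if __name__ == "__main__":
    # Hypothetical instance labels from two clusters.
    seen = [
        "mgr1.ceph1.example.com:9283",
        "mgr2.ceph1.example.com:9283",
        "mgr1.ceph2.example.com:9283",
    ]
    for name in clusters(seen):
        print(name, "->", instance_matcher(name))
```

In Grafana the same extraction would normally be done with the regex
field of the dashboard variable rather than in Python; the code only
shows what the regular expression has to capture.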

I don't think the current dashboards are intended for use outside the
internal Prometheus deployments, but we definitely intended (at some
point when time permitted) to try and submit patches that would work
for both use-cases, since it's painful to continually update the
dashboards on every release.

On Tue, Sep 28, 2021 at 12:45 PM Karsten Nielsen <karsten@xxxxxxxxxx> wrote:
>
> Hi,
> I am running ceph 16.2.6 installed with cephadm.
> I have enabled prometheus so that metrics can be scraped from an
> external Prometheus server.
> I have 3 nodes with a mgr daemon. All of them reply to a query against
> node:9283/metrics, but 2 return an empty reply: the non-active mgrs.
> Is there a node:9283/health or other path to query to identify the ones
> that are not active?
> I am asking because I get empty dashboards 2 out of 3 times, as there
> are no metrics when the wrong endpoint gets scraped.
>
> Thanks,
> - Karsten
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx