Hi Matthew,

That makes sense. The IPv6 bug can lead to the issue you described. In the current implementation, whenever a mgr failover takes place, the Prometheus configuration (when using the monitoring stack deployed by Ceph) is updated automatically to point to the new active mgr. Unfortunately, it's not easy to have active services running in the standby mgr. At most, we can do some redirection, as we do in the dashboard. So far we haven't had the need to do that.

The next releases will come with the new mgmt-gateway service introduced in [1] and [2], which will make it easy to have a single entry point to the cluster that handles HA transparently in the backend. This is still WIP, but you can play with it if you want using the latest code from main. Support for OIDC based on oauth2-proxy is also being introduced as part of this effort by [3].

@Tim Holloway, as I said, the support [4] for service discovery has been there for a while (about 2 years, I'd say). Unless you are using an old Ceph version (where the Prometheus config was static), you should see traffic on port 8765.

[1] https://github.com/ceph/ceph/pull/57535
[2] https://github.com/ceph/ceph/pull/58402
[3] https://github.com/ceph/ceph/pull/58460
[4] https://github.com/ceph/ceph/pull/46400

On Thu, Sep 5, 2024 at 7:00 PM Tim Holloway <timh@xxxxxxxxxxxxx> wrote:

> Now you've got me worried. As I said, there is absolutely no traffic
> using port 8765 on my LAN.
>
> Am I missing a service? Since my distro is based on stock Prometheus,
> I'd have to assume that the port 8765 server would be part of the Ceph
> generic container image and isn't being switched on for some reason.
>
> Tim
>
> On Thu, 2024-09-05 at 15:05 +0100, Matthew Vernon wrote:
> > On 05/09/2024 15:03, Matthew Vernon wrote:
> > > Hi,
> > >
> > > On 05/09/2024 12:49, Redouane Kachach wrote:
> > >
> > > > The port 8765 is the "service discovery" (an internal server that
> > > > runs in the mgr... you can change the port by changing the
> > > > variable service_discovery_port of cephadm). Normally it is
> > > > opened in the active mgr and the service is used by prometheus
> > > > (server) to get the targets by using the http service discovery
> > > > feature [1]. This feature has been there for a long time now and
> > > > it's the default configuration used by Ceph monitoring stack. It
> > > > should start automatically without any external intervention (or
> > > > manual configuration).
> > >
> > > Right; it wasn't running because I have an IPv6 deployment (that
> > > bug's fixed in 18.2.4 - https://tracker.ceph.com/issues/63448).
> >
> > ...though I'm not sure that having only the active mgr run this
> > endpoint is correct, though? Isn't it more useful to be able to e.g.
> > point my Prometheus at any of the mgrs and have service discovery
> > work, rather than needing Prometheus to know which mgr is active to
> > know which sd to talk to, which seems to rather defeat the point?
> >
> > Thanks,
> >
> > Matthew

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
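
For anyone who wants to check which mgr host is actually accepting connections on the service discovery port mentioned in this thread, here is a minimal sketch. It only tests whether a TCP connection can be opened; it does not talk to the service discovery API itself. The function name and the host names in the usage comment are hypothetical, and 8765 is assumed to be the port only because it is the default discussed above.

```python
import socket


def mgrs_listening(hosts, port=8765, timeout=2.0):
    """Return the hosts that accept a TCP connection on `port`.

    This is a plain reachability probe, not a Ceph API call.
    """
    listening = []
    for host in hosts:
        try:
            # create_connection tries every address the name resolves to,
            # so IPv4 and IPv6 deployments are both covered.
            with socket.create_connection((host, port), timeout=timeout):
                listening.append(host)
        except OSError:
            # Connection refused, unreachable, timed out, or name not resolved.
            pass
    return listening


# Usage (hypothetical host names, replace with your own mgr hosts):
#   print(mgrs_listening(["mgr-a.example", "mgr-b.example"]))
```

If the description above holds, only the active mgr should show up in the result, and an empty list on a recent release would suggest the endpoint is not starting (for example because of the IPv6 bug mentioned earlier).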