Re: Discovery (port 8765) service not starting

Matthew Vernon <mvernon@xxxxxxxxxxxxx> · Fri, 6 Sep 2024 10:27:22 +0100

Hi,

On 06/09/2024 08:08, Redouane Kachach wrote:

> That makes sense. The IPv6 bug can lead to the issue you described. In
> the current implementation, whenever a mgr failover takes place, the
> Prometheus configuration (when using the monitoring stack deployed by
> Ceph) is updated automatically to point to the new active mgr.
> Unfortunately it's not easy to have active services running in the
> standby mgr. At most, we can do some redirection as we do in the
> dashboard.

I've had a little look at the standby mode in the prometheus module; it 
can already figure out the active metrics URL via 
module.get_active_uri(), so it looks like adding an additional 
"redirect" standby_behaviour that emits an HTTPRedirect to the active 
URI wouldn't be too hard. Is that a change you'd be in principle willing 
to accept?
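
To make the idea concrete, here is a rough, standalone sketch of the behaviour I have in mind. It is not the module's code: it uses Python's standard http.server rather than the mgr's web framework, the get_active_uri callable merely stands in for module.get_active_uri(), and the host name, the 307 status and the 503 fallback are my own assumptions for illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_standby_handler(get_active_uri):
    """Build a handler class that redirects every GET to the active mgr.

    get_active_uri is a hypothetical stand-in for module.get_active_uri();
    it is assumed to return a base URL, or an empty value if none is known.
    """

    class StandbyRedirectHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            active_uri = get_active_uri()
            if not active_uri:
                # No active mgr known yet: ask the scraper to retry later.
                self.send_response(503)
                self.send_header("Content-Length", "0")
                self.end_headers()
                return
            # Keep the requested path (e.g. /metrics) when redirecting.
            target = active_uri.rstrip("/") + self.path
            self.send_response(307)
            self.send_header("Location", target)
            self.send_header("Content-Length", "0")
            self.end_headers()

        def log_message(self, fmt, *args):
            pass  # keep the sketch quiet

    return StandbyRedirectHandler


def probe(port, path="/metrics"):
    """Issue one GET without following redirects; return (status, Location)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        resp.read()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def run_demo(active_uri="http://active-mgr.example:9283/"):
    """Start a throwaway 'standby' server on a free port and probe it once."""
    handler = make_standby_handler(lambda: active_uri)
    server = HTTPServer(("127.0.0.1", 0), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return probe(server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()
```

Calling run_demo() returns the status code and Location header that a scraper hitting the standby would see. Whether an external Prometheus follows the redirect depends on its scrape configuration, so that would need checking separately.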

> So far we haven't had the need to do that. Next releases will
> come with the new mgmt-gateway service introduced in [1] and [2] which
> will make it easy to have a single entry point to the cluster handling
> HA transparently in the backend.

Interesting, thanks. This would then be configured as a single endpoint 
(which one could presumably redeploy from time to time) that external 
monitoring could be pointed at?

Regards,

Matthew