On 08/29/2018 11:02 AM, William Lawton wrote:
> We have a 5 node Ceph cluster, status output copied below. During our
> cluster resiliency tests we have noted that a MON leader election takes
> place when we fail one member of the MON quorum, even though the failed
> instance is not the current MON leader. We speculate that this
> re-election process may be contributing to short periods of cluster
> unavailability when one or more cluster instances fail. Is there a way
> to configure the cluster so that there is only a MON leader election if
> the existing MON leader fails but not when some other member of the MON
> quorum fails?

Not at the moment, and this hasn't been in our plans.

My reasoning, at least, has been that if a monitor failed, an election
is the best way we have to ensure the remaining monitors are alive and
communicative. And the election itself should be a quick process anyway,
so this never became a particularly pressing feature.

I'd suggest opening a feature request in the tracker, asking for this.
And, if possible, attach logs to the ticket showing that the election is
taking too long, or evidence that you're getting I/O stalls during this
period.

(for the mon logs, I'd suggest 'debug mon = 10', 'debug paxos = 10', and
'debug ms = 1')

  -Joao
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
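
For reference, the three debug levels suggested above for the mon logs could be set in ceph.conf roughly as follows. This is only a sketch: placing them under the [mon] section is an assumption (the reply gives the option names and values, not where to put them), and settings changed in the file only take effect once the monitors re-read their configuration, typically on restart.

```ini
# Sketch only: the [mon] section placement is an assumption,
# the option names and values are the ones suggested in the reply.
[mon]
    debug mon = 10
    debug paxos = 10
    debug ms = 1
```

These levels make the monitor logs considerably more verbose, so it is probably worth reverting them once the election has been captured.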