Re: problem returning mon back to cluster

Harald Staub <harald.staub@xxxxxxxxx> · Tue, 15 Oct 2019 09:14:35 +0200

On 14.10.19 16:31, Nikola Ciprich wrote:
> On Mon, Oct 14, 2019 at 01:40:19PM +0200, Harald Staub wrote:
>> Probably same problem here. When I try to add another MON, "ceph
>> health" becomes mostly unresponsive. One of the existing ceph-mon
>> processes uses 100% CPU for several minutes. Tried it on 2 test
>> clusters (14.2.4, 3 MONs, 5 storage nodes with around 2 hdd osds
>> each). To avoid errors like "lease timeout", I temporarily increase
>> "mon lease", from 5 to 50 seconds.
>>
>> Not sure how bad it is from a customer PoV. But it is a problem by
>> itself to be several minutes without "ceph health", when there is an
>> increased risk of losing the quorum ...
>
> Hi Harry,
>
> thanks a lot for your reply! not sure we're experiencing the same issue,
> i don't have it on any other cluster.. when this is happening to you, does
> only ceph health stop working, or it also blocks all clients IO?

Hi Nik

Yes, you are right, client I/O is not affected. Also, stopping and
starting an existing MON is ok. But adding a MON without increasing "mon
lease", as mentioned, led to quorum flapping, so this might be similar.
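
For anyone who wants to try the same workaround, here is a minimal sketch of
raising "mon lease" only for the duration of the MON addition. It assumes the
option is changed through the centralized config ("ceph config set" /
"ceph config rm", available on 14.2.x); the thread does not say how the value
was actually changed, so treat that as an assumption. The helper names
(set_lease_cmd, clear_lease_cmd, raised_mon_lease) are made up for this
example.

```python
import subprocess
from contextlib import contextmanager

# Values taken from the thread: 5 seconds normally, 50 seconds while adding a MON.
DEFAULT_MON_LEASE = 5
RAISED_MON_LEASE = 50


def set_lease_cmd(seconds):
    """Build the CLI call that overrides mon_lease for all MONs."""
    return ["ceph", "config", "set", "mon", "mon_lease", str(seconds)]


def clear_lease_cmd():
    """Build the CLI call that removes the override again.

    Assumption: there was no earlier mon_lease override in the 'mon'
    section, so removing it falls back to the built-in default.
    """
    return ["ceph", "config", "rm", "mon", "mon_lease"]


@contextmanager
def raised_mon_lease(seconds=RAISED_MON_LEASE, run=subprocess.check_call):
    """Raise mon_lease, run the enclosed block, then remove the override.

    'run' is injectable so the sequence can be dry-run without a cluster.
    """
    run(set_lease_cmd(seconds))
    try:
        yield
    finally:
        # Always drop the override, even if adding the MON failed.
        run(clear_lease_cmd())


# Hypothetical usage (the step that adds the MON is whatever you normally do):
#
#     with raised_mon_lease():
#         add_the_new_mon()
```

Passing run=print instead of the default only prints the two command lists,
which is a cheap way to check the sequence before touching a real cluster.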

Cheers
 Harry
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com