Re: problem returning mon back to cluster

Probably the same problem here. When I try to add another MON, "ceph health" becomes mostly unresponsive, and one of the existing ceph-mon processes uses 100% CPU for several minutes. I tried it on 2 test clusters (14.2.4, 3 MONs, 5 storage nodes with around 2 HDD OSDs each). To avoid errors like "lease timeout", I temporarily increase "mon lease" from 5 to 50 seconds.
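For anyone who wants to try the same workaround, here is a rough sketch of how the lease could be raised and later restored. It assumes the centralized config database (`ceph config`, available since Mimic) and that "mon lease" corresponds to the option name `mon_lease`; the function names are made up for illustration, and I have not verified whether running monitors pick up the change without a restart.

```shell
# Sketch only: assumes "ceph config" is available and that the
# "mon lease" setting is stored under the option name mon_lease.

# Raise the monitor lease; $1 is the new value in seconds (50 if omitted).
raise_mon_lease() {
    ceph config set mon mon_lease "${1:-50}"
}

# Remove the override again so the built-in default applies.
restore_mon_lease() {
    ceph config rm mon mon_lease
}
```

The idea would be to call `raise_mon_lease 50` before adding the MON and `restore_mon_lease` once it has joined the quorum.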

Not sure how bad it is from a customer PoV. But being without "ceph health" for several minutes is a problem in itself, at a time when there is an increased risk of losing the quorum ...

 Harry

On 13.10.19 20:26, Nikola Ciprich wrote:
dear ceph users and developers,

on one of our production clusters, we got into a pretty unpleasant situation.

After rebooting one of the nodes, when trying to start the monitor, the whole cluster
seems to hang, including IO, ceph -s etc. When this mon is stopped again,
everything seems to continue. Trying to spawn a new monitor leads to the same problem
(even on a different node).

I had to give up after minutes of outage, since it's unacceptable. I think we had this
problem once in the past on this cluster, but after some (much shorter) time, the monitor
joined and it has worked fine since then.

All cluster nodes are CentOS 7 machines. I have 3 monitors (so 2 are now running), and I'm
using ceph 13.2.6.

Network connection seems to be fine.

Has anyone seen a similar problem? I'd be very grateful for tips on how to debug and solve this.

for those interested, here's the log of one of the running monitors with debug_mon set to 10/10:

https://storage.lbox.cz/public/d258d0
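
In case someone wants to collect a comparable log, a minimal sketch of setting that debug level through the monitor's local admin socket follows. It assumes the command is run on the host where the monitor runs; the function name and the monitor id argument are placeholders, not taken from this cluster.

```shell
# Sketch only: talks to the local admin socket, so it has to be run
# on the monitor's own host and does not need a working quorum.
# $1: monitor id (placeholder, e.g. the short hostname)
# $2: debug level, e.g. 10/10
set_mon_debug() {
    ceph daemon "mon.$1" config set debug_mon "$2"
}
```

For example, `set_mon_debug "$(hostname -s)" 10/10` would apply if the monitor id happens to match the short hostname.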

if I can provide more info, please let me know

with best regards

nikola ciprich







_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


