problem returning mon back to cluster

Dear Ceph users and developers,

on one of our production clusters, we have run into a pretty unpleasant situation.

After rebooting one of the nodes, starting its monitor makes the whole cluster
appear to hang, including IO, ceph -s, etc. When this mon is stopped again,
everything resumes. Trying to spawn a new monitor leads to the same problem
(even on a different node).

I had to give up after some minutes of outage, since it's unacceptable. I think we had this
problem once in the past on this cluster, but after some (much shorter) time the monitor
joined, and it has worked fine since then.

All cluster nodes are CentOS 7 machines. I have 3 monitors (so 2 are now running), and I'm
using Ceph 13.2.6.

Network connection seems to be fine.

Has anyone seen a similar problem? I'd be very grateful for tips on how to debug and solve this.

For those interested, here's the log of one of the running monitors with debug_mon set to 10/10:

https://storage.lbox.cz/public/d258d0
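
For reference: while `ceph -s` hangs, each monitor's own view can usually still be read through its local admin socket with `ceph daemon mon.<id> mon_status`, which does not need quorum. The following is only a minimal sketch of collecting that output, assuming it is run on the node hosting the monitor with permission to access the admin socket. The helper names `read_mon_status` and `summarize` are hypothetical, and the monitor ID is a placeholder.

```python
import json
import subprocess


def read_mon_status(mon_id):
    """Ask a local monitor for its status over the admin socket.

    Runs `ceph daemon mon.<id> mon_status`. This talks to the daemon's
    local socket, so it does not depend on the cluster having quorum.
    Assumption: executed on the monitor's own host by a user allowed
    to read the admin socket.
    """
    result = subprocess.run(
        ["ceph", "daemon", "mon.{}".format(mon_id), "mon_status"],
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        timeout=10,  # do not block forever if the daemon is unresponsive
    )
    return json.loads(result.stdout)


def summarize(status):
    """Reduce mon_status JSON to the fields relevant to a quorum problem."""
    mons = status.get("monmap", {}).get("mons", [])
    return {
        "name": status.get("name"),
        # state is one of e.g. probing, electing, synchronizing, leader, peon
        "state": status.get("state"),
        "election_epoch": status.get("election_epoch"),
        "quorum_ranks": status.get("quorum", []),
        "monmap_mons": [m.get("name") for m in mons],
    }
```

Calling `summarize(read_mon_status("a"))` on each monitor host (with `a` replaced by the real monitor ID) and comparing `state` and `election_epoch` across the mons may show whether they are stuck electing or synchronizing while the third mon is up.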

If I can provide more info, please let me know.

with best regards

nikola ciprich







-- 
-------------------------------------
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 00 Ostrava

tel.:   +420 591 166 214
fax:    +420 596 621 273
mobil:  +420 777 093 799

www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: servis@xxxxxxxxxxx
-------------------------------------
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


