On Thu, 5 Mar 2020, Dan van der Ster wrote:
> Hi all,
>
> There's something broken in our env when we try to add new mons to
> existing clusters, confirmed on two clusters running mimic and
> nautilus. It's basically this issue
> https://tracker.ceph.com/issues/42830
>
> In case something is wrong with our puppet manifests, I'm trying to
> do it manually.
>
> First we --mkfs the mon and start it, but as soon as the new mon
> starts synchronizing, the existing leader becomes unresponsive and an
> election is triggered.
>
> Here's exactly what I'm doing:
>
> # cd /var/lib/ceph/tmp/
> # scp cephmon1:/var/lib/ceph/tmp/keyring.mon.cephmon1 keyring.mon.cephmon4
> # ceph mon getmap -o monmap
> # ceph-mon --mkfs -i cephmon4 --monmap monmap --keyring keyring.mon.cephmon4 --setuser ceph --setgroup ceph
> # vi /etc/ceph/ceph.conf <add the new mon to ceph.conf like this>
> [mon.cephmon4]
> host = cephmon4
> mon addr = a.b.c.d:6790
> # systemctl start ceph-mon@cephmon4
>
> The log file on the new mon shows it start synchronizing, then
> immediately the CPU usage on the leader goes to 100% and elections
> start happening, and ceph health shows mon slow ops. perf top of the
> ceph-mon with 100% CPU is shown below [1].
> On a small nautilus cluster, the new mon gets added within a minute
> or so (but not cleanly -- the leader is unresponsive for quite a while
> until the new mon joins). debug_mon=20 on the leader doesn't show
> anything very interesting.
> On our large mimic cluster we tried waiting more than 10 minutes --
> suffering through several mon elections and 100% usage bouncing around
> between leaders -- until we gave up.
>
> I'm pulling my hair out a bit on this -- it's really weird!

Can you try running a rocksdb compaction on the existing mons before
adding the new one and see if that helps?

s

>
> Did anyone add a new mon to an existing large cluster recently, and it
> went smoothly?
>
> Cheers, Dan
>
> [1]
>
>   15.12%  ceph-mon             [.] MonitorDBStore::Transaction::encode
>    8.95%  libceph-common.so.0  [.] ceph::buffer::v14_2_0::ptr::append
>    8.68%  libceph-common.so.0  [.] ceph::buffer::v14_2_0::list::append
>    7.69%  libceph-common.so.0  [.] ceph::buffer::v14_2_0::ptr::release
>    5.86%  libceph-common.so.0  [.] ceph::buffer::v14_2_0::ptr::ptr
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
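
The compaction suggested in the reply above could be scripted roughly as follows. This is only a sketch: it assumes `ceph tell mon.<id> compact` is available on the mimic and nautilus releases in question, and the function name `compact_mons` and the `CEPH` variable are made up for illustration. By default it only prints the commands (dry run); set `CEPH=ceph` to actually run them. Only cephmon1 is named in the thread, so pass the real IDs of the existing mons as arguments.

```shell
#!/bin/sh
# Sketch: compact the store of each existing monitor before adding a new one.
# compact_mons and CEPH are hypothetical names, not part of ceph itself.
# CEPH defaults to "echo ceph", so the commands are printed, not executed.

compact_mons() {
    ceph_cmd="${CEPH:-echo ceph}"
    for mon in "$@"; do
        # Ask this monitor to compact its key/value store.
        # $ceph_cmd is left unquoted on purpose so "echo ceph" splits into two words.
        $ceph_cmd tell "mon.$mon" compact || return 1
    done
}

# Dry run against the one existing mon named in the thread.
compact_mons cephmon1
```

Running `CEPH=ceph compact_mons <existing mon ids>` one mon at a time, and checking `ceph -s` between each, would avoid compacting all of quorum at once.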