Hi,

This sounds a lot like the negative progress bug we found just last week:
https://tracker.ceph.com/issues/50591

That bug makes the mon enter a very long loop rendering a progress bar if the mgr incorrectly sends the mon a negative progress value. Octopus and later don't have this loop, so they don't have this bug.

Could you set debug_mgr = 4/5 and then check the mgr log for something like this (a command sketch follows the quoted mail below)?

mgr[progress] Updated progress to -0.333333333333 (Rebalancing after osd... marked in)

Cheers,
Dan

On Tue, May 4, 2021 at 4:10 PM Rainer Krienke <krienke@xxxxxxxxxxxxxx> wrote:
>
> Hello,
>
> I am playing around with a test ceph 14.2.20 cluster. The cluster
> consists of 4 VMs, each VM has 2 OSDs. The first three VMs vceph1,
> vceph2 and vceph3 are monitors. vceph1 is also the mgr.
>
> What I did was quite simple. The cluster is in the state HEALTHY:
>
> vceph2: systemctl stop ceph-osd@2
> # let ceph repair until ceph -s reports the cluster is healthy again
>
> vceph2: systemctl start ceph-osd@2   # @ 15:39:15, for the logs
> # ceph -s reports that 8 OSDs are up and in, then the
> # rebalance onto osd.2 starts
>
> vceph2: ceph -s   # hangs forever, also if executed on vceph3 or vceph4
> # the mon on vceph1 eats 100% CPU permanently, the other mons ~0% CPU
>
> vceph1: systemctl stop ceph-mon@vceph1    # takes ~30 sec to terminate
> vceph1: systemctl start ceph-mon@vceph1   # everything is OK again
>
> I posted the mon log to: https://cloud.uni-koblenz.de/s/t8tWjWFAobZb5Hy
>
> Strangely enough, if I set "debug mon 20" before starting the
> experiment, this bug does not show up. I also tried the very same
> procedure on the same cluster updated to 15.2.11, but I was unable to
> reproduce this bug in that ceph version.
>
> Thanks
> Rainer
> --
> Rainer Krienke, Uni Koblenz, Rechenzentrum, A22, Universitaetsstrasse 1
> 56070 Koblenz, Web: http://www.uni-koblenz.de/~krienke, Tel: +49261287 1312
> PGP: http://www.uni-koblenz.de/~krienke/mypgp.html, Fax: +49261287 1001312
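For reference, a minimal sketch of the debug_mgr check described above. It assumes the cluster uses the centralized config database (available since Mimic, so it applies to 14.2.20) and that logs live in the default /var/log/ceph/; the actual mgr log file name depends on the daemon's name (ceph-mgr.<name>.log):

   # raise mgr verbosity so the progress module logs its updates
   ceph config set mgr debug_mgr 4/5

   # reproduce the osd restart, then search the mgr log for progress updates
   grep 'Updated progress' /var/log/ceph/ceph-mgr.*.log

   # drop the override again when done
   ceph config rm mgr debug_mgr

If a negative value such as "Updated progress to -0.333333333333" shows up right when osd.2 is marked in again, that would point to the tracker issue above.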