Hello,

recently we wanted to re-adjust the rebalancing speed in one cluster with

    ceph tell osd.* injectargs '--osd-max-backfills 4'
    ceph tell osd.* injectargs '--osd-recovery-max-active 4'

The first OSDs responded, but after about 6-7 OSDs ceph tell stopped
progressing, right after it encountered a dead OSD (osd.10). We have
since removed osd.10, and all OSDs in the cluster are now up. However,
as soon as we issue either of the above tell commands, it just hangs.

Furthermore, while ceph tell hangs, PGs also become stuck in the
"activating" and "peering" states. This seems to be related: as soon as
we stop ceph tell (ctrl-c it), the PGs are peered/active a few minutes
later.

We can also reproduce the problem with very busy OSDs that have been
moved to another host - they do not react to the ceph tell commands
either.

We are mostly on 14.2.9, apart from the rgw:

[16:44:47] black2.place6:~# ceph versions
{
    "mon": {
        "ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) nautilus (stable)": 3
    },
    "mgr": {
        "ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) nautilus (stable)": 3
    },
    "osd": {
        "ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) nautilus (stable)": 85
    },
    "mds": {},
    "rgw": {
        "ceph version 20200428-923-g4004f081ec (4004f081ec047d60e84d76c2dad6f31e2ac44484) nautilus (stable)": 1
    },
    "overall": {
        "ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) nautilus (stable)": 91,
        "ceph version 20200428-923-g4004f081ec (4004f081ec047d60e84d76c2dad6f31e2ac44484) nautilus (stable)": 1
    }
}

Has anyone seen this before, and/or do you have a hint on how to debug
ceph tell, given that it is not a daemon of its own?

Best regards,

Nico

--
Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch
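
P.S.: To narrow this down, one thing we may try next is re-running
ceph tell against a single OSD with client-side debug logging turned
up; a sketch of that (untested, osd.0 is only an example target):

    # log the messenger and mon-client traffic of the hanging cli client
    ceph --debug-ms 1 --debug-monc 10 tell osd.0 injectargs '--osd-max-backfills 4'

That should at least show which daemon the client is stuck waiting on.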
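
P.P.S.: As an interim workaround we are considering setting the values
through the local admin sockets instead of ceph tell; a sketch, assuming
the default socket paths under /var/run/ceph (not yet tried on this
cluster):

    # run on each OSD host, once per local admin socket
    for sock in /var/run/ceph/ceph-osd.*.asok; do
        ceph daemon "$sock" config set osd_max_backfills 4
        ceph daemon "$sock" config set osd_recovery_max_active 4
    done

Alternatively, "ceph config set osd osd_max_backfills 4" should push the
value from the mons, but we have not verified whether it behaves any
better while ceph tell is hanging.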