Hi Chris,

While we look into this, I have a couple of questions:

1. Did the recovery rate stay at 1 object/sec throughout? In our tests we have seen that the rate is higher during the starting phase of recovery and eventually tapers off due to throttling by mclock.

2. Can you try speeding up the recovery by changing all the OSDs to the "high_recovery_ops" profile, to see whether it improves things (both CPU load and recovery rate)?

3. On the OSDs that showed high CPU usage, can you run the following command and report back? It just dumps the mclock settings on those OSDs:

   sudo ceph daemon osd.N config show | grep osd_mclock

I will update the tracker with these questions as well so that the discussion can continue there.

Thanks,
-Sridhar

On Tue, Jul 12, 2022 at 4:49 PM Chris Palmer <chris.palmer@xxxxxxxxx> wrote:
> I've created tracker https://tracker.ceph.com/issues/56530 for this,
> including info on replicating it on another cluster.

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
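
P.S. For question 2, a sketch of how the profile switch might be done (the `ceph config set` / `ceph tell` forms below are my suggestion, not something Chris has run; the value for osd.N is a placeholder, and option names should be verified against your Ceph release):

```shell
# Persistently set the mclock profile for all OSDs via the monitor config store
ceph config set osd osd_mclock_profile high_recovery_ops

# Or inject it at runtime into a single OSD (does not survive a restart)
ceph tell osd.N config set osd_mclock_profile high_recovery_ops

# Then confirm the active profile and related mclock settings on that OSD
sudo ceph daemon osd.N config show | grep osd_mclock
```

Switching back to the default "balanced" profile afterwards would use the same commands with that value instead.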