Hi Michael,

Are you using mclock or wpq? mClock manages backfill limits by itself and generally allows a larger number of concurrent backfills, but it is not favoured by many admins due to its limitations around recovery operations. You can cross-check which profile you are using with osd_mclock_profile and set it to high_recovery_ops at the expense of client traffic.

With wpq you have the ability to set osd_max_backfills more precisely - check what value you have currently; it looks like it's set to 1. Try increasing it if so (a rough command sequence is appended below the quoted message).

Best,
Laimis J.
laimis.juzeliunas@xxxxxxxxxx

> On 14 Jan 2025, at 17:03, Ml Ml <mliebherr99@xxxxxxxxxxxxxx> wrote:
>
> Hello List,
>
> I have this 3-node setup with 17 HDDs (new - spinning rust).
>
> After putting it all together and after resetting the manual
> rack placements back to default, it seems to recover forever.
> I also changed the PG placements at some point.
>
> The disks were very busy until this morning, when I disabled scrubbing
> to see if this would speed up the recovery.
>
> Here is my current status:
> ----------------------------------------
> root@ceph12:~# ceph -s
>   cluster:
>     id:     2ed9f8fb-1316-4ef1-996d-a3223a3dd594
>     health: HEALTH_WARN
>             noscrub,nodeep-scrub flag(s) set
>
>   services:
>     mon: 3 daemons, quorum ceph10,ceph11,ceph12 (age 2h)
>     mgr: ceph11(active, since 3h), standbys: ceph10, ceph12
>     osd: 17 osds: 17 up (since 3h), 17 in (since 11d); 18 remapped pgs
>          flags noscrub,nodeep-scrub
>
>   data:
>     pools:   2 pools, 257 pgs
>     objects: 9.62M objects, 37 TiB
>     usage:   110 TiB used, 134 TiB / 244 TiB avail
>     pgs:     1534565/28857846 objects misplaced (5.318%)
>              239 active+clean
>              16  active+remapped+backfill_wait
>              2   active+remapped+backfilling
>
>   io:
>     recovery: 25 MiB/s, 6 objects/s
>
> I set:
> ceph tell 'osd.*' injectargs --osd-max-backfills=6 --osd-recovery-max-active=9
>
> My hourly recovery progress doesn't really seem to advance (it decreases
> and increases again):
> 5.410
> 5.036
> 5.269
> 5.008
> 5.373
> 5.769
> 5.555
> 5.103
> 5.067
> 5.135
> 5.409
> 5.417
> 5.373
> 5.197
> 5.090
> 5.458
> 5.204
> 5.339
> 5.164
> 5.425
> 5.692
> 5.383
> 5.726
> 5.492
> 6.694
> 6.576
> 6.362
> 6.243
> 6.011
> 5.880
> 5.589
> 5.433
> 5.846
> 5.378
> 5.184
> 5.647
> 5.374
> 5.513
>
> root@ceph12:~# ceph osd perf   (this is okay for spinning rust, I guess)
> osd  commit_latency(ms)  apply_latency(ms)
>  17                   0                  0
>  13                   0                  0
>  11                  57                 57
>   9                   4                  4
>   0                   0                  0
>   1                  79                 79
>  14                  43                 43
>   2                   0                  0
>   3                   0                  0
>  16                  43                 43
>   4                  33                 33
>   5                   0                  0
>   6                   0                  0
>   7                   0                  0
>  10                   0                  0
>  12                   0                  0
>   8                  48                 48
>
> iostat -xt 3 shows that they are busy, but not overloaded as far as I can tell.
>
> Any idea why it will only backfill 2 PGs at a time?
> How could I speed this up?
>
> My other cluster with 7 nodes, very old HDDs and 45 OSDs can easily do
> 76 objects/s.
>
> Cheers,
> Michael

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
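For reference, a rough command sequence for the checks above - this is only a sketch: option names assume a Quincy/Reef-era release where mClock is the default scheduler, "osd.0" and the value 6 are just placeholders, and osd_mclock_override_recovery_settings only exists on newer releases, so verify everything against your version's documentation first.

# Which scheduler are the OSDs actually running? (wpq or mclock_scheduler)
ceph config show osd.0 osd_op_queue

# If mclock_scheduler: check the active profile and switch it to favour recovery
ceph config get osd osd_mclock_profile
ceph config set osd osd_mclock_profile high_recovery_ops

# If wpq (or mclock with osd_mclock_override_recovery_settings=true):
# raise the backfill limit persistently instead of via injectargs
ceph config get osd osd_max_backfills
ceph config set osd osd_max_backfills 6

Setting the values with "ceph config set" (rather than injectargs) keeps them across OSD restarts; you can revert later with "ceph config rm osd osd_max_backfills".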