Re: Any idea why misplaced recovery won't finish?

Hi Michael,

Are you using mclock or wpq? mclock handles backfill throttling on its own and generally allows a higher number of concurrent backfills, but it is not favoured by many admins because of its limitations around tuning recovery operations. You can check which profile you are using with osd_mclock_profile and set it to high_recovery_ops at the expense of client traffic.
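
For reference, a quick sketch of those checks with the plain ceph CLI (profile names are the standard ones; switch back once the backfill is done):

# which scheduler the OSDs are running (mclock_scheduler vs wpq)
ceph config get osd osd_op_queue

# current mclock profile
ceph config get osd osd_mclock_profile

# prioritise recovery over client I/O; revert to balanced or
# high_client_ops afterwards
ceph config set osd osd_mclock_profile high_recovery_ops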
With wpq you have the ability to set osd_max_backfills more precisely - check what value you currently have; it looks like it's set to 1. Try increasing it if so.
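
Something along these lines (osd.0 and the value 4 are just examples). Note that on mclock, changes to osd_max_backfills are ignored unless osd_mclock_override_recovery_settings is true - that may be why your injectargs call did not seem to help:

# what an OSD is actually running with
ceph config show osd.0 osd_max_backfills

# raise it cluster-wide
ceph config set osd osd_max_backfills 4

# only needed on mclock, so the value above is honoured
ceph config set osd osd_mclock_override_recovery_settings true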


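If you want to verify that backfill is actually making headway rather than just shuffling, watching the PG states and the misplaced counter over time is the easiest way:

# which PGs are backfilling and which are still queued
ceph pg ls backfilling
ceph pg ls backfill_wait

# the misplaced object count should trend downwards
ceph -s | grep misplaced
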
Best,
Laimis J.
laimis.juzeliunas@xxxxxxxxxx



> On 14 Jan 2025, at 17:03, Ml Ml <mliebherr99@xxxxxxxxxxxxxx> wrote:
> 
> Hello List,
> 
> I have this 3-node setup with 17 HDDs (new - spinning rust).
> 
> After putting it all together and resetting the manual
> rack placements back to default, it seems to recover forever.
> I also changed the PG placements at some point.
> 
> The disks were very busy until this morning, when I disabled scrubbing
> to see if this would speed up the recovery.
> 
> Here is my current status:
> ----------------------------------------
> root@ceph12:~# ceph -s
>  cluster:
>    id:     2ed9f8fb-1316-4ef1-996d-a3223a3dd594
>    health: HEALTH_WARN
>            noscrub,nodeep-scrub flag(s) set
> 
>  services:
>    mon: 3 daemons, quorum ceph10,ceph11,ceph12 (age 2h)
>    mgr: ceph11(active, since 3h), standbys: ceph10, ceph12
>    osd: 17 osds: 17 up (since 3h), 17 in (since 11d); 18 remapped pgs
>         flags noscrub,nodeep-scrub
> 
>  data:
>    pools:   2 pools, 257 pgs
>    objects: 9.62M objects, 37 TiB
>    usage:   110 TiB used, 134 TiB / 244 TiB avail
>    pgs:     1534565/28857846 objects misplaced (5.318%)
>             239 active+clean
>             16  active+remapped+backfill_wait
>             2   active+remapped+backfilling
> 
>  io:
>    recovery: 25 MiB/s, 6 objects/s
> 
> I set:
> ceph tell 'osd.*' injectargs --osd-max-backfills=6 --osd-recovery-max-active=9
> 
> My hourly recovery progress does not really seem to proceed (it decreases
> and increases again):
> 5.410
> 5.036
> 5.269
> 5.008
> 5.373
> 5.769
> 5.555
> 5.103
> 5.067
> 5.135
> 5.409
> 5.417
> 5.373
> 5.197
> 5.090
> 5.458
> 5.204
> 5.339
> 5.164
> 5.425
> 5.692
> 5.383
> 5.726
> 5.492
> 6.694
> 6.576
> 6.362
> 6.243
> 6.011
> 5.880
> 5.589
> 5.433
> 5.846
> 5.378
> 5.184
> 5.647
> 5.374
> 5.513
> 
> root@ceph12:~# ceph osd perf (this is okay for spinning rust, I guess)
> osd  commit_latency(ms)  apply_latency(ms)
> 17                   0                  0
> 13                   0                  0
> 11                  57                 57
>  9                   4                  4
>  0                   0                  0
>  1                  79                 79
> 14                  43                 43
>  2                   0                  0
>  3                   0                  0
> 16                  43                 43
>  4                  33                 33
>  5                   0                  0
>  6                   0                  0
>  7                   0                  0
> 10                   0                  0
> 12                   0                  0
>  8                  48                 48
> 
> iostat -xt 3 shows that they are busy, but not overloaded as far as I can tell.
> 
> Any idea why it will only backfill 2 PGs at a time?
> How could I speed this up?
> 
> My other cluster, with 7 nodes, very old HDDs and 45 OSDs, can easily
> manage 76 objects/s.
> 
> Cheers,
> Michael

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


