Re: Slow recovery on Quincy

Chris Palmer <chris.palmer@xxxxxxxxx> · Wed, 17 May 2023 15:48:12 +0100

This is interesting, and it arrived minutes after I had replaced an HDD 
OSD (with NVMe DB/WAL) in a small cluster. With the three profiles I was 
only seeing around 6-8 objects/second (high_client_ops), 9-12 
(balanced) and 12-15 (high_recovery_ops). There was only a very light 
client load.

With a bit of ad-hoc experimenting I ended up setting 
osd_mclock_cost_per_byte_usec_hdd to 0.4 which seemed to give about a 
7-8 times increase in objects/second for each profile, without unduly 
affecting client response if I used high_client_ops. Values lower than 
0.4 did speed up backfill, but disk saturation started to become an issue.

These are only very rough-and-ready figures, but they do suggest 
something rather wrong in the calculations.

Chris

On 16/05/2023 19:07, 胡 玮文 wrote:
Hi Sake,

We are experiencing the same. I set "osd_mclock_cost_per_byte_usec_hdd" to 0.1 (the default is 2.6) and got about 15 times the backfill speed, without significantly affecting client I/O. This parameter seems to have been calculated wrongly: going by its description, 5e-3 should be a reasonable value for an HDD (corresponding to 200 MB/s). I noticed that the default was originally 5.2 and was later changed to 2.6 to increase recovery speed. So I suspect the original author simply converted the units incorrectly, and may have intended 5.2e-3 but wrote 5.2 in the code.
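
The unit arithmetic behind that remark can be checked with a short Python sketch. It assumes, as the paragraph above does, that a per-byte cost in microseconds is simply the inverse of a sustained bandwidth; the function names are made up for illustration, and this is not a description of how the mClock scheduler actually uses the parameter.

```python
# Convert between a sustained bandwidth and a per-byte cost in microseconds.
# Plain unit arithmetic only; function names are illustrative, not Ceph APIs.

def cost_per_byte_usec(bandwidth_mb_per_s: float) -> float:
    """Microseconds spent per byte at the given bandwidth (1 MB = 1e6 bytes)."""
    bytes_per_s = bandwidth_mb_per_s * 1e6
    return 1e6 / bytes_per_s  # 1e6 microseconds in a second


def implied_bandwidth_mb_per_s(cost_usec: float) -> float:
    """Bandwidth in MB/s implied by a per-byte cost in microseconds."""
    bytes_per_s = 1e6 / cost_usec
    return bytes_per_s / 1e6


if __name__ == "__main__":
    print("200 MB/s ->", cost_per_byte_usec(200), "usec per byte")
    # Values mentioned in this thread: old default, current default,
    # the two hand-tuned settings, and the value suggested above.
    for cost in (5.2, 2.6, 0.4, 0.1, 5e-3):
        print(f"{cost} usec/byte -> {implied_bandwidth_mb_per_s(cost):.3f} MB/s")
```

Under that assumption, the 2.6 default corresponds to well under 1 MB/s, while 5e-3 corresponds to 200 MB/s.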

But all of this may not matter in the next version: I see that the relevant code has been rewritten and this parameter has been removed.

The high_recovery_ops profile works very poorly for us: it increases the average client I/O latency from 50 ms to about 1 s.

Weiwen Hu

On 16 May 2023, at 19:16, Sake Paulusma <sake1989@xxxxxxxxxxx> wrote:

We noticed extremely slow performance whenever remapping is necessary. We didn't do anything special other than assigning the correct device_class (ssd). When checking ceph status (with watch -n 1 -c ceph status), we see that only around 17-25 objects per second are recovering.

How can we speed up the recovery process?

There is practically no client load, because we will only migrate to this cluster in the future; for now only an occasional rsync is run.

[ceph: root@pwsoel12998 /]# ceph status
  cluster:
    id:     da3ca2e4-ee5b-11ed-8096-0050569e8c3b
    health: HEALTH_WARN
            noscrub,nodeep-scrub flag(s) set

  services:
    mon: 5 daemons, quorum pqsoel12997,pqsoel12996,pwsoel12994,pwsoel12998,prghygpl03 (age 3h)
    mgr: pwsoel12998.ylvjcb(active, since 3h), standbys: pqsoel12997.gagpbt
    mds: 4/4 daemons up, 2 standby
    osd: 32 osds: 32 up (since 73m), 32 in (since 6d); 10 remapped pgs
         flags noscrub,nodeep-scrub

  data:
    volumes: 2/2 healthy
    pools:   5 pools, 193 pgs
    objects: 13.97M objects, 853 GiB
    usage:   3.5 TiB used, 12 TiB / 16 TiB avail
    pgs:     755092/55882956 objects misplaced (1.351%)
             183 active+clean
             10  active+remapped+backfilling

  io:
    recovery: 2.3 MiB/s, 20 objects/s
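
To put the status output above in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the recovery rate stays constant and that the "objects misplaced" count and the "objects/s" rate count the same kind of unit, which may not hold exactly; the function names are illustrative only.

```python
# Rough estimate from the figures in the ceph status output above.
# Assumes a constant recovery rate; function names are illustrative.

def eta_hours(misplaced_objects: int, objects_per_s: float) -> float:
    """Hours needed to move the misplaced objects at a constant rate."""
    return misplaced_objects / objects_per_s / 3600


def avg_object_kib(recovery_mib_per_s: float, objects_per_s: float) -> float:
    """Average size in KiB of the objects currently being recovered."""
    return recovery_mib_per_s * 1024 / objects_per_s


if __name__ == "__main__":
    print(f"estimated time remaining: {eta_hours(755092, 20):.1f} h")
    print(f"average object size: {avg_object_kib(2.3, 20):.0f} KiB")
```

At 20 objects/s the 755092 misplaced objects would take roughly 10.5 hours, and the average recovered object is only about 118 KiB, so the bottleneck looks like per-object rate rather than raw bandwidth.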

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx