Re: Wide EC pool causes very slow backfill?

Nice improvement with wpq:

"
  data:
    volumes: 1/1 healthy
    pools:   13 pools, 11153 pgs
    objects: 313.18M objects, 1003 TiB
    usage:   1.6 PiB used, 1.6 PiB / 3.2 PiB avail
    pgs:     366564427/1681736248 objects misplaced (21.797%)
             5905 active+clean
             5139 active+remapped+backfill_wait
             109  active+remapped+backfilling

  io:
    client:   28 MiB/s rd, 258 MiB/s wr, 677 op/s rd, 772 op/s wr
    recovery: 3.5 GiB/s, 1.00k objects/s
"

Thanks again.

Mvh.

Torkil

On 18-01-2024 13:26, Torkil Svensgaard wrote:
Np. Thanks, we'll try with wpq instead as next step.

Out of curiosity, how does that work in the interim, since switching requires restarting the OSDs? For a period of time we will have some OSDs on mclock and some on wpq, right?
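
As a rough way to check which OSDs ended up on which scheduler during that window, something like this should work (osd.0 is just an example ID):

"
# value stored centrally for all OSDs
ceph config get osd osd_op_queue

# value a specific running OSD is actually using
ceph tell osd.0 config get osd_op_queue
"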

Mvh.

Torkil

On 18/01/2024 13:11, Eugen Block wrote:
Oh, I missed that line with the mclock profile, sorry.


Quoting Eugen Block <eblock@xxxxxx>:

Hi,

what is your current mclock profile? The default is "balanced":

quincy-1:~ # ceph config get osd osd_mclock_profile
balanced

You could try setting it to high_recovery_ops [1], or disable it altogether [2]:

quincy-1:~ # ceph config set osd osd_op_queue wpq
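
For [1], the high_recovery_ops profile is set through the same config interface, e.g.:

quincy-1:~ # ceph config set osd osd_mclock_profile high_recovery_ops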


[1] https://docs.ceph.com/en/quincy/rados/configuration/mclock-config-ref/
[2] https://docs.clyso.com/blog/2023/03/22/ceph-how-do-disable-mclock-scheduler/

Quoting Torkil Svensgaard <torkil@xxxxxxxx>:

Hi

Our 17.2.7 cluster:

"
-33          886.00842      datacenter 714
-7          209.93135          host ceph-hdd1
-69           69.86389          host ceph-flash1
-6          188.09579          host ceph-hdd2
-3          233.57649          host ceph-hdd3
-12          184.54091          host ceph-hdd4
-34          824.47168      datacenter DCN
-73           69.86389          host ceph-flash2
-5          252.27127          host ceph-hdd14
-2          201.78067          host ceph-hdd5
-81          288.26501          host ceph-hdd6
-31          264.56207          host ceph-hdd7
-36         1284.48621      datacenter TBA
-77           69.86389          host ceph-flash3
-21          190.83224          host ceph-hdd8
-29          199.08838          host ceph-hdd9
-11          193.85382          host ceph-hdd10
-9          237.28154          host ceph-hdd11
-26          187.19536          host ceph-hdd12
-4          206.37102          host ceph-hdd13
"

We recently created an EC 4+5 pool with failure domain datacenter. The DCN datacenter only had 2 HDD hosts, so we added one more to make it possible at all, since each DC needs 3 shards, as I understand it.
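
A CRUSH rule for that layout (3 datacenters, 3 shards on distinct hosts in each) typically looks something like the following; the rule name, id and root bucket name are just placeholders:

"
rule ec_4plus5_3dc {
    id 99                                # placeholder id
    type erasure
    step set_chooseleaf_tries 5
    step set_choose_tries 100
    step take default                    # or whatever the crush root is called
    step choose indep 3 type datacenter  # pick 3 datacenters
    step chooseleaf indep 3 type host    # 3 shards on distinct hosts per DC
    step emit
}
"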

Backfill was really slow though, so we just added another host to the DCN datacenter. Backfill looks like this:

"
 data:
   volumes: 1/1 healthy
   pools:   13 pools, 11153 pgs
   objects: 311.53M objects, 1000 TiB
   usage:   1.6 PiB used, 1.6 PiB / 3.2 PiB avail
   pgs:     60/1669775060 objects degraded (0.000%)
            373356926/1669775060 objects misplaced (22.360%)
            5944 active+clean
            5177 active+remapped+backfill_wait
            22   active+remapped+backfilling
            4    active+recovery_wait+degraded+remapped
            3    active+recovery_wait+remapped
            2    active+recovery_wait+degraded
            1    active+recovering+degraded+remapped

 io:
   client:   73 MiB/s rd, 339 MiB/s wr, 1.06k op/s rd, 561 op/s wr
   recovery: 1.2 GiB/s, 313 objects/s
"

Given that the first host added had 19 OSDs, with none of them anywhere near the target capacity, and the one we just added has 22 empty OSDs, having just 22 PGs backfilling and 1 recovering seems somewhat underwhelming.

Is this to be expected with such a pool? The mclock profile is high_recovery_ops.
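
For reference, these are the knobs I'd look at to judge backfill concurrency (osd.0 is just an example ID); as far as I understand, mclock may clamp the effective value regardless of what is configured:

"
# value configured centrally
ceph config get osd osd_max_backfills

# value a running OSD is actually using
ceph tell osd.0 config get osd_max_backfills
"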

Mvh.

Torkil

--
Torkil Svensgaard
Sysadmin
MR-Forskningssektionen, afs. 714
DRCMR, Danish Research Centre for Magnetic Resonance
Hvidovre Hospital
Kettegård Allé 30
DK-2650 Hvidovre
Denmark
Tel: +45 386 22828
E-mail: torkil@xxxxxxxx

--
Torkil Svensgaard
Systems Administrator
Danish Research Centre for Magnetic Resonance DRCMR, Section 714
Copenhagen University Hospital Amager and Hvidovre
Kettegaard Allé 30, 2650 Hvidovre, Denmark
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



