(adding back the list)

On Tue, Mar 21, 2023 at 11:25 AM Joachim Kraftmayer
<joachim.kraftmayer@xxxxxxxxx> wrote:

> I added the questions and answers below.
>
> ___________________________________
> Best Regards,
> Joachim Kraftmayer
> CEO | Clyso GmbH
>
> Clyso GmbH
> p: +49 89 21 55 23 91 2
> a: Loristraße 8 | 80335 München | Germany
> w: https://clyso.com | e: joachim.kraftmayer@xxxxxxxxx
>
> We are hiring: https://www.clyso.com/jobs/
> ---
> CEO: Dipl. Inf. (FH) Joachim Kraftmayer
> Registered office: Utting am Ammersee
> Commercial register at the district court of Augsburg
> Commercial register number: HRB 25866
> VAT ID no.: DE275430677
>
> On 21.03.23 at 11:14, Gauvain Pocentek wrote:
>
> Hi Joachim,
>
> On Tue, Mar 21, 2023 at 10:13 AM Joachim Kraftmayer
> <joachim.kraftmayer@xxxxxxxxx> wrote:
>
>> Which Ceph version are you running, is mclock active?
>
> We're using Quincy (17.2.5), upgraded step by step from Luminous if I
> remember correctly.
>
> Did you recreate the OSDs? If yes, at which version?

I actually don't remember all the history, but I think we added the HDD
nodes while running Pacific.

> mclock seems active, set to the high_client_ops profile. HDD OSDs have
> very different settings for max capacity IOPS:
>
> osd.137   basic   osd_mclock_max_capacity_iops_hdd    929.763899
> osd.161   basic   osd_mclock_max_capacity_iops_hdd   4754.250946
> osd.222   basic   osd_mclock_max_capacity_iops_hdd    540.016984
> osd.281   basic   osd_mclock_max_capacity_iops_hdd   1029.193945
> osd.282   basic   osd_mclock_max_capacity_iops_hdd   1061.762870
> osd.283   basic   osd_mclock_max_capacity_iops_hdd    462.984562
>
> We haven't set those explicitly, could they be the reason for the slow
> recovery?
>
> I recommend disabling mclock for now, and yes, we have seen slow
> recovery caused by mclock.

Stupid question: how do you do that? I've looked through the docs but
could only find information about changing the settings. [See the
command sketch at the end of this message.]

> Bonus question: does ceph set that itself?
>
> Yes, and if you have a setup with HDD + SSD (db & wal), the discovery
> does not work correctly.

Good to know!

Gauvain

> Thanks!
>
> Gauvain
>
>> Joachim
>>
>> ___________________________________
>> Clyso GmbH - Ceph Foundation Member
>>
>> On 21.03.23 at 06:53, Gauvain Pocentek wrote:
>> > Hello all,
>> >
>> > We have an EC (4+2) pool for RGW data, with HDDs + SSDs for WAL/DB.
>> > This pool has 9 servers, each with 12 disks of 16 TB. About 10 days
>> > ago we lost a server and we've removed its OSDs from the cluster.
>> > Ceph has started to remap and backfill as expected, but the process
>> > has been getting slower and slower. Today the recovery rate is around
>> > 12 MiB/s and 10 objects/s. All the remaining unclean PGs are
>> > backfilling:
>> >
>> >   data:
>> >     volumes: 1/1 healthy
>> >     pools:   14 pools, 14497 pgs
>> >     objects: 192.38M objects, 380 TiB
>> >     usage:   764 TiB used, 1.3 PiB / 2.1 PiB avail
>> >     pgs:     771559/1065561630 objects degraded (0.072%)
>> >              1215899/1065561630 objects misplaced (0.114%)
>> >              14428 active+clean
>> >              50    active+undersized+degraded+remapped+backfilling
>> >              18    active+remapped+backfilling
>> >              1     active+clean+scrubbing+deep
>> >
>> > We've checked the health of the remaining servers, and everything
>> > looks fine (CPU/RAM/network/disks).
>> >
>> > Any hints on what could be happening?
>> >
>> > Thank you,
>> > Gauvain
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
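
[Follow-up note] "Disabling mclock" as suggested above comes down to
switching the OSD op queue back to WPQ. A rough sketch of the relevant
commands, assuming a Quincy cluster and the standard ceph CLI (osd.137
is one of the IDs from the config dump above; double-check option names
against the docs for your exact release):

# Check which scheduler an OSD is currently running with:
ceph config show osd.137 osd_op_queue

# Switch all OSDs back to the WPQ scheduler. osd_op_queue is only read
# at start-up, so the OSDs have to be restarted afterwards (e.g. one
# host at a time) for the change to take effect:
ceph config set osd osd_op_queue wpq

# Once back on WPQ, the classic recovery/backfill knobs apply again:
ceph config set osd osd_max_backfills 2

# Alternative while staying on mclock: switch to a recovery-friendly
# profile, which takes effect at runtime:
ceph config set osd osd_mclock_profile high_recovery_ops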
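
On the wildly different osd_mclock_max_capacity_iops_hdd values: those
are measured by each OSD itself at start-up, and as noted above the
measurement can produce unrealistically high numbers on hybrid
HDD + SSD (DB/WAL) setups. A sketch of how one might clear the skewed
per-OSD overrides so a sane value is used instead (again, osd.137 is
just an example taken from the dump above):

# List the stored per-OSD capacity values:
ceph config dump | grep osd_mclock_max_capacity_iops

# Remove an implausible per-OSD override so the default applies again
# (repeat for each affected OSD):
ceph config rm osd.137 osd_mclock_max_capacity_iops_hdd

# Or pin an explicit value for all HDD OSDs; 315 is only an example
# figure for a typical spinner, pick one that matches your drives:
ceph config set osd osd_mclock_max_capacity_iops_hdd 315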