Re: New pool created with 2048 pg_num not executed

Hi Martin,

I can't find the output of

ceph osd df tree
ceph status

anywhere. I thought you had posted it, but apparently not. Could you please post the output of these commands?

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Martin Buss <mbuss7004@xxxxxxxxx>
Sent: 14 December 2022 22:02:43
To: Frank Schilder; ceph-users@xxxxxxx
Cc: Eugen Block
Subject: Re:  Re: New pool created with 2048 pg_num not executed

Hi Frank,

thanks for coming in on this. Setting target_max_misplaced_ratio to 1
does not help.
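
For reference, a minimal sketch of how such a setting is applied (assuming the mgr-scoped config option documented for recent releases):

ceph config set mgr target_max_misplaced_ratio 1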

Regards,
Martin

On 14.12.22 21:32, Frank Schilder wrote:
> Hi Eugen: déjà vu again?
>
> I think the way the autoscaler code in the MGRs interferes with operations is extremely confusing.
>
> Could this be the same issue that somebody else and I had a while ago? Even though the autoscaler is disabled, parts of it in the MGR still interfere. One of the essential config options was target_max_misplaced_ratio, which needs to be set to 1 if you want all PGs to be created regardless of how many objects are misplaced.
>
> The thread was https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/WST6K5A4UQGGISBFGJEZS4HFL2VVWW32
>
> In addition, the PG splitting will stop if recovery IO is going on (some objects are degraded).
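>
> For completeness, a quick way to check whether any objects are currently degraded (just a sketch that greps the status output):
>
> ceph -s | grep -i degraded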
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Martin Buss <mbuss7004@xxxxxxxxx>
> Sent: 14 December 2022 19:32
> To: ceph-users@xxxxxxx
> Subject:  Re: New pool created with 2048 pg_num not executed
>
> Will do; that will take another day or so.
>
> Can this have anything to do with osd_pg_bits, which defaults to 6?
> Some operators seem to be working with 8 or 11.
>
> Can you explain what this option means? I could not quite understand
> it from the documentation.
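>
> For what it's worth, the current value can be inspected with something like the following (a sketch, assuming the standard config interface):
>
> ceph config get osd osd_pg_bits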
>
> Thanks!
>
> On 14.12.22 16:11, Eugen Block wrote:
>> Then I'd suggest waiting until the backfilling is done and then reporting
>> back if the PGs are still not created. I don't have any information about
>> the ML admin, sorry.
>>
>> Zitat von Martin Buss <mbuss7004@xxxxxxxxx>:
>>
>>> cephfs_data has been autoscaling while filling; the mismatched
>>> numbers are a result of that autoscaling.
>>>
>>> The cluster status is WARN, as there is still some old stuff
>>> backfilling on cephfs_data.
>>>
>>> The issue is the newly created pool 9 'cfs_data', which is stuck at
>>> pg_num 1152.
>>>
>>> PS: Can you help me get in touch with the list admin, so I can have
>>> the post that includes private info deleted?
>>>
>>> On 14.12.22 15:41, Eugen Block wrote:
>>>> I'm wondering why the cephfs_data pool has mismatching pg_num and
>>>> pgp_num:
>>>>
>>>>> pool 1 'cephfs_data' replicated size 3 min_size 2 crush_rule 0
>>>>> object_hash rjenkins pg_num 187 pgp_num 59 autoscale_mode off
>>>>
>>>> Does disabling the autoscaler in the middle of scaling leave it like
>>>> that? What is the current 'ceph status'?
>>>>
>>>>
>>>> Zitat von Martin Buss <mbuss7004@xxxxxxxxx>:
>>>>
>>>>> Hi Eugen,
>>>>>
>>>>> thanks, sure, below:
>>>>>
>>>>> pg_num stuck at 1152 and pgp_num stuck at 1024
>>>>>
>>>>> Regards,
>>>>>
>>>>> Martin
>>>>>
>>>>> ceph config set global mon_max_pg_per_osd 400
>>>>>
>>>>> ceph osd pool create cfs_data 2048 2048 --pg_num_min 2048
>>>>> pool 'cfs_data' created
>>>>>
>>>>> pool 1 'cephfs_data' replicated size 3 min_size 2 crush_rule 0
>>>>> object_hash rjenkins pg_num 187 pgp_num 59 autoscale_mode off
>>>>> last_change 3099 lfor 0/3089/3096 flags hashpspool,bulk stripe_width
>>>>> 0 target_size_ratio 1 application cephfs
>>>>> pool 2 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0
>>>>> object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode off
>>>>> last_change 2942 lfor 0/0/123 flags hashpspool stripe_width 0
>>>>> pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application
>>>>> cephfs
>>>>> pool 3 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash
>>>>> rjenkins pg_num 1 pgp_num 1 autoscale_mode off last_change 2943
>>>>> flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1
>>>>> application mgr
>>>>> pool 9 'cfs_data' replicated size 3 min_size 2 crush_rule 0
>>>>> object_hash rjenkins pg_num 1152 pgp_num 1024 pg_num_target 2048
>>>>> pgp_num_target 2048 autoscale_mode off last_change 3198 lfor
>>>>> 0/0/3198 flags hashpspool stripe_width 0 pg_num_min 2048
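>>>>>
>>>>> A sketch of pushing both values explicitly, in case something in the mgr is throttling the increase (assuming the standard pool-set commands):
>>>>>
>>>>> ceph osd pool set cfs_data pg_num 2048
>>>>> ceph osd pool set cfs_data pgp_num 2048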
>>>>>
>>>>>
>>>>>
>>>>> On 14.12.22 15:10, Eugen Block wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Are there already existing pools in the cluster? Can you share your
>>>>>> 'ceph osd df tree' as well as 'ceph osd pool ls detail'? It sounds
>>>>>> like Ceph is trying to stay within the limit of mon_max_pg_per_osd
>>>>>> (default 250).
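>>>>>>
>>>>>> The effective limit can be checked with something along these lines (a sketch, assuming the config interface):
>>>>>>
>>>>>> ceph config get mon mon_max_pg_per_osd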
>>>>>>
>>>>>> Regards,
>>>>>> Eugen
>>>>>>
>>>>>> Zitat von Martin Buss <mbuss7004@xxxxxxxxx>:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> on quincy, I created a new pool
>>>>>>>
>>>>>>> ceph osd pool create cfs_data 2048 2048
>>>>>>>
>>>>>>> 6 hosts, 71 OSDs
>>>>>>>
>>>>>>> The autoscaler is off. I find it kind of strange that the pool is
>>>>>>> created with pg_num 1152 and pgp_num 1024, with 2048 shown as the
>>>>>>> new target. I cannot manage to actually get this pool to pg_num
>>>>>>> 2048 and pgp_num 2048.
>>>>>>>
>>>>>>> What config option am I missing that prevents me from growing the
>>>>>>> pool to 2048? Although I specified that pg_num and pgp_num be the
>>>>>>> same, they are not.
>>>>>>>
>>>>>>> Some help and guidance would be appreciated.
>>>>>>>
>>>>>>> Thank you,
>>>>>>>
>>>>>>> Martin
>>>>>>> _______________________________________________
>>>>>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>>>>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>>>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>>>> _______________________________________________
>>>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>>
>>> _______________________________________________
>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>
>>
>>
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@xxxxxxx
>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



