Hi Martin,

I can't find the output of 'ceph osd df tree' or 'ceph status' anywhere. I thought you had posted it, but perhaps not. Could you please post the output of these commands?

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Martin Buss <mbuss7004@xxxxxxxxx>
Sent: 14 December 2022 22:02:43
To: Frank Schilder; ceph-users@xxxxxxx
Cc: Eugen Block
Subject: Re: Re: New pool created with 2048 pg_num not executed

Hi Frank,

thanks for coming in on this. Setting target_max_misplaced_ratio to 1 does not help.

Regards,
Martin

On 14.12.22 21:32, Frank Schilder wrote:
> Hi Eugen: déjà vu again?
>
> I think the way the autoscaler code in the MGRs interferes with operations is extremely confusing.
>
> Could this be the same issue I and somebody else had a while ago? Even though the autoscaler is disabled, there are parts of it in the MGR that still interfere. One of the essential config options was target_max_misplaced_ratio, which needs to be set to 1 if you want all PGs created regardless of how many objects are misplaced.
>
> The thread was https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/WST6K5A4UQGGISBFGJEZS4HFL2VVWW32
>
> In addition, PG splitting will stop if recovery IO is going on (some objects are degraded).
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Martin Buss <mbuss7004@xxxxxxxxx>
> Sent: 14 December 2022 19:32
> To: ceph-users@xxxxxxx
> Subject: Re: New pool created with 2048 pg_num not executed
>
> Will do, that will take another day or so.
>
> Can this have anything to do with osd_pg_bits, which defaults to 6?
> Some operators seem to be working with 8 or 11.
>
> Can you explain what this option means? I could not quite understand
> it from the documentation.
>
> Thanks!
>
> On 14.12.22 16:11, Eugen Block wrote:
>> Then I'd suggest waiting until the backfilling is done and then
>> reporting back if the PGs are still not created. I don't have
>> information about the ML admin, sorry.
>>
>> Zitat von Martin Buss <mbuss7004@xxxxxxxxx>:
>>
>>> cephfs_data has been autoscaling while filling; the mismatched
>>> numbers are a result of that autoscaling.
>>>
>>> The cluster status is WARN as there is still some old stuff
>>> backfilling on cephfs_data.
>>>
>>> The issue is the newly created pool 9 cfs_data, which is stuck at
>>> 1152 pg_num.
>>>
>>> PS: can you help me get in touch with the list admin so I can have
>>> that post, including private info, deleted?
>>>
>>> On 14.12.22 15:41, Eugen Block wrote:
>>>> I'm wondering why the cephfs_data pool has mismatching pg_num and
>>>> pgp_num:
>>>>
>>>>> pool 1 'cephfs_data' replicated size 3 min_size 2 crush_rule 0
>>>>> object_hash rjenkins pg_num 187 pgp_num 59 autoscale_mode off
>>>>
>>>> Does disabling the autoscaler leave it like that when you disable it
>>>> in the middle of scaling? What is the current 'ceph status'?
>>>>
>>>> Zitat von Martin Buss <mbuss7004@xxxxxxxxx>:
>>>>
>>>>> Hi Eugen,
>>>>>
>>>>> thanks, sure, below:
>>>>>
>>>>> pg_num is stuck at 1152 and pgp_num is stuck at 1024.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Martin
>>>>>
>>>>> ceph config set global mon_max_pg_per_osd 400
>>>>>
>>>>> ceph osd pool create cfs_data 2048 2048 --pg_num_min 2048
>>>>> pool 'cfs_data' created
>>>>>
>>>>> pool 1 'cephfs_data' replicated size 3 min_size 2 crush_rule 0
>>>>> object_hash rjenkins pg_num 187 pgp_num 59 autoscale_mode off
>>>>> last_change 3099 lfor 0/3089/3096 flags hashpspool,bulk stripe_width
>>>>> 0 target_size_ratio 1 application cephfs
>>>>> pool 2 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0
>>>>> object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode off
>>>>> last_change 2942 lfor 0/0/123 flags hashpspool stripe_width 0
>>>>> pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application
>>>>> cephfs
>>>>> pool 3 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash
>>>>> rjenkins pg_num 1 pgp_num 1 autoscale_mode off last_change 2943
>>>>> flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1
>>>>> application mgr
>>>>> pool 9 'cfs_data' replicated size 3 min_size 2 crush_rule 0
>>>>> object_hash rjenkins pg_num 1152 pgp_num 1024 pg_num_target 2048
>>>>> pgp_num_target 2048 autoscale_mode off last_change 3198 lfor
>>>>> 0/0/3198 flags hashpspool stripe_width 0 pg_num_min 2048
>>>>>
>>>>> On 14.12.22 15:10, Eugen Block wrote:
>>>>>> Hi,
>>>>>>
>>>>>> are there already existing pools in the cluster? Can you share your
>>>>>> 'ceph osd df tree' as well as 'ceph osd pool ls detail'? It sounds
>>>>>> like ceph is trying to stay within the limit of mon_max_pg_per_osd
>>>>>> (default 250).
>>>>>>
>>>>>> Regards,
>>>>>> Eugen
>>>>>>
>>>>>> Zitat von Martin Buss <mbuss7004@xxxxxxxxx>:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> on quincy, I created a new pool:
>>>>>>>
>>>>>>> ceph osd pool create cfs_data 2048 2048
>>>>>>>
>>>>>>> 6 hosts, 71 OSDs.
>>>>>>>
>>>>>>> The autoscaler is off; I find it kind of strange that the pool is
>>>>>>> created with pg_num 1152 and pgp_num 1024, mentioning 2048 as
>>>>>>> the new target. I cannot manage to actually make this pool contain
>>>>>>> 2048 pg_num and 2048 pgp_num.
>>>>>>>
>>>>>>> What config option am I missing that does not allow me to grow the
>>>>>>> pool to 2048? Although I specified that pg_num and pgp_num be the
>>>>>>> same, they are not.
>>>>>>>
>>>>>>> Please, some help and guidance.
>>>>>>>
>>>>>>> Thank you,
>>>>>>>
>>>>>>> Martin
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
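
For reference, the diagnostics asked for over the course of the thread can be pulled together as below. This is a minimal sketch, assuming the pool name cfs_data from the thread and admin access to the cluster; output will of course differ per cluster.

# Cluster health plus any ongoing recovery/backfill (PG splitting pauses while objects are degraded)
ceph status
# Per-OSD PG counts and utilisation, laid out along the CRUSH tree
ceph osd df tree
# Pool details: pg_num, pgp_num and the pg_num_target/pgp_num_target the cluster is working towards
ceph osd pool ls detail | grep cfs_data
# The limit Eugen mentions (default 250 PGs per OSD); Martin raised it to 400 earlier in the thread
ceph config get mon mon_max_pg_per_osd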
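
Frank's point about leftover autoscaler machinery in the MGR throttling PG creation concerns the target_max_misplaced_ratio option. A minimal sketch of checking and relaxing it, along the lines of the linked thread (note that Martin reports above that this alone did not help in his case):

# Default is 0.05, i.e. the mgr tries to keep at most 5% of objects misplaced at any time
ceph config get mgr target_max_misplaced_ratio
# Let pg_num/pgp_num increases proceed regardless of the misplaced ratio
ceph config set mgr target_max_misplaced_ratio 1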
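
If the targets themselves need a nudge, the values can also be re-issued explicitly; whether that unblocks this particular cluster is not confirmed anywhere in the thread, so treat this purely as a sketch. Recent Ceph releases step the actual pg_num/pgp_num towards the targets in small increments, throttled by target_max_misplaced_ratio and paused while recovery is still running:

ceph osd pool set cfs_data pg_num 2048
ceph osd pool set cfs_data pgp_num 2048
# Watch the actual values converge towards the *_target values
watch -n 30 "ceph osd pool ls detail | grep cfs_data"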