Re: Problems with autoscaler (overlapping roots) after changing the pool class


 



Agreed.  The .mgr pool is a usual suspect here, especially when using Rook.  Once any pool is constrained to a device class, this kind of warning will appear unless *all* pools specify one.
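
One quick check is to see which rule .mgr (and any other leftover pools) are using, and to point it at a device-class rule, for example reusing the replicated_hdd rule created further down in this thread (adjust the rule name to whatever actually exists in your cluster):

ceph osd pool get .mgr crush_rule
ceph osd pool set .mgr crush_rule replicated_hdd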

Of course there’s also the strategy of disabling the autoscaler and sizing pg_num yourself, but that takes more analysis.  We old farts are used to it, but it can be daunting for whippersnappers.
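
If you do go that route, the autoscaler can be turned off per pool, or made the default for new pools (a minimal sketch; the pool name is a placeholder):

ceph osd pool set <pool> pg_autoscale_mode off
ceph config set global osd_pool_default_pg_autoscale_mode off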

> On Dec 23, 2024, at 9:11 AM, Eugen Block <eblock@xxxxxx> wrote:
> 
> Don't try to delete a root, that will definitely break something. Instead, check which crush rules don't use a device class and use crushtool's reclassify feature to modify them. This will still trigger a bit of data movement, but not as much as simply changing the rules would.
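> 
> The workflow looks roughly like this (file names are placeholders; the --compare step lets you estimate the data movement before committing the new map):
> 
> ceph osd getcrushmap -o original
> crushtool -i original --reclassify \
>   --set-subtree-class default hdd \
>   --reclassify-root default hdd \
>   -o adjusted
> crushtool -i original --compare adjusted
> ceph osd setcrushmap -i adjusted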
> 
> Zitat von Rok Jaklič <rjaklic@xxxxxxxxx>:
> 
>> I got a similar problem after changing the pools' crush rules to use only the
>> hdd class, following
>> https://www.spinics.net/lists/ceph-users/msg84987.html. The data migrated
>> successfully.
>> 
>> I get warnings like:
>> 2024-12-23T14:39:37.103+0100 7f949edad640  0 [pg_autoscaler WARNING root]
>> pool default.rgw.buckets.index won't scale due to overlapping roots: {-1,
>> -18}
>> 2024-12-23T14:39:37.105+0100 7f949edad640  0 [pg_autoscaler WARNING root]
>> pool default.rgw.buckets.data won't scale due to overlapping roots: {-2,
>> -1, -18}
>> 2024-12-23T14:39:37.107+0100 7f949edad640  0 [pg_autoscaler WARNING root]
>> pool cephfs_metadata won't scale due to overlapping roots: {-2, -1, -18}
>> 2024-12-23T14:39:37.111+0100 7f949edad640  0 [pg_autoscaler WARNING root]
>> pool 1 contains an overlapping root -1... skipping scaling
>> ...
>> 
>> while the crush tree with shadow entries (ceph osd crush tree --show-shadow) shows:
>> -2    hdd  1043.93188  root default~hdd
>> -4    hdd   151.82336      host ctplosd1~hdd
>>  0    hdd     5.45798          osd.0
>>  1    hdd     5.45798          osd.1
>>  2    hdd     5.45798          osd.2
>>  3    hdd     5.45798          osd.3
>>  4    hdd     5.45798          osd.4
>> ...
>> -1         1050.48230  root default
>> -3          153.27872      host ctplosd1
>>  0    hdd     5.45798          osd.0
>>  1    hdd     5.45798          osd.1
>>  2    hdd     5.45798          osd.2
>>  3    hdd     5.45798          osd.3
>>  4    hdd     5.45798          osd.4
>> ...
>> 
>> and even though the crush rule for, for example,
>> 
>> pool 9 'default.rgw.buckets.data' erasure profile ec-32-profile size 5
>> min_size 4 crush_rule 1 object_hash rjenkins pg_num 512 pgp_num 512
>> autoscale_mode on last_change 320144 lfor 0/127784/214408 flags
>> hashpspool,ec_overwrites stripe_width 12288 application rgw
>> 
>> is set to:
>>        {
>>            "rule_id": 1,
>>            "rule_name": "ec32",
>>            "type": 3,
>>            "steps": [
>>                {
>>                    "op": "set_chooseleaf_tries",
>>                    "num": 5
>>                },
>>                {
>>                    "op": "set_choose_tries",
>>                    "num": 100
>>                },
>>                {
>>                    "op": "take",
>>                    "item": -2,
>>                    "item_name": "default~hdd"
>>                },
>>                {
>>                    "op": "chooseleaf_indep",
>>                    "num": 0,
>>                    "type": "host"
>>                },
>>                {
>>                    "op": "emit"
>>                }
>>            ]
>>        },
>> 
>> I still get the warning messages.
>> 
>> Is there a way to check whether a particular "root" is used anywhere, other
>> than going through ceph osd pool ls detail and looking at each crush rule?
>> 
>> Can I somehow delete the "old" root default?
>> 
>> Would it be safe to change pg_num manually even with overlapping roots?
>> 
>> Rok
>> 
>> 
>> On Wed, Jan 25, 2023 at 12:03 PM Massimo Sgaravatto <
>> massimo.sgaravatto@xxxxxxxxx> wrote:
>> 
>>> I tried the following on a small testbed first:
>>> 
>>> ceph osd erasure-code-profile set profile-4-2-hdd k=4 m=2
>>> crush-failure-domain=host crush-device-class=hdd
>>> ceph osd crush rule create-erasure ecrule-4-2-hdd profile-4-2-hdd
>>> ceph osd pool set ecpool-4-2 crush_rule ecrule-4-2-hdd
>>> 
>>> and indeed, after applying this change to all the EC pools, the
>>> autoscaler doesn't complain anymore.
>>> 
>>> Thanks a lot !
>>> 
>>> Cheers, Massimo
>>> 
>>> On Tue, Jan 24, 2023 at 7:02 PM Eugen Block <eblock@xxxxxx> wrote:
>>> 
>>> > Hi,
>>> >
>>> > What you can't change with EC pools is the EC profile; the pool's
>>> > crush rule you can change. The fix is the same as for the replicated
>>> > pools: assign a rule with the hdd class, and after some data movement
>>> > the autoscaler should not complain anymore.
>>> >
>>> > Regards
>>> > Eugen
>>> >
>>> > Zitat von Massimo Sgaravatto <massimo.sgaravatto@xxxxxxxxx>:
>>> >
>>> > > Dear all
>>> > >
>>> > > I have just changed the crush rule for all the replicated pools in the
>>> > > following way:
>>> > >
>>> > > ceph osd crush rule create-replicated replicated_hdd default host hdd
>>> > > ceph osd pool set  <poolname> crush_rule replicated_hdd
>>> > >
>>> > > See also this [*] thread.
>>> > > Before applying this change, these pools were all using
>>> > > the replicated_ruleset rule, where no device class is specified.
>>> > >
>>> > >
>>> > >
>>> > > I am now noticing a problem with the autoscaler: "ceph osd pool
>>> > > autoscale-status" doesn't report any output, and the mgr log complains
>>> > > about overlapping roots:
>>> > >
>>> > >  [pg_autoscaler ERROR root] pool xyz has overlapping roots: {-18, -1}
>>> > >
>>> > >
>>> > > Indeed:
>>> > >
>>> > > # ceph osd crush tree --show-shadow
>>> > > ID   CLASS  WEIGHT      TYPE NAME
>>> > > -18    hdd  1329.26501  root default~hdd
>>> > > -17    hdd   329.14154      rack Rack11-PianoAlto~hdd
>>> > > -15    hdd    54.56085          host ceph-osd-04~hdd
>>> > >  30    hdd     5.45609              osd.30
>>> > >  31    hdd     5.45609              osd.31
>>> > > ...
>>> > > ...
>>> > >  -1         1329.26501  root default
>>> > >  -7          329.14154      rack Rack11-PianoAlto
>>> > >  -8           54.56085          host ceph-osd-04
>>> > >  30    hdd     5.45609              osd.30
>>> > >  31    hdd     5.45609              osd.31
>>> > > ...
>>> > >
>>> > > I have already read about this behavior, but I have no clear idea of
>>> > > how to fix the problem.
>>> > >
>>> > > I read somewhere that the problem happens when there are rules that
>>> > > force some pools to use only one class, while there are also pools
>>> > > that do not make any distinction between device classes.
>>> > >
>>> > >
>>> > > All the replicated pools are using the replicated_hdd rule, but I also
>>> > > have some EC pools which use a profile where the class is not
>>> > > specified. As far as I understand, I can't force these pools to use
>>> > > only the hdd class: according to the docs, I can't change this profile
>>> > > to specify the hdd class (or at least the change wouldn't be applied to
>>> > > the existing EC pools).
>>> > >
>>> > > Any suggestions?
>>> > >
>>> > > The crush map is available at https://cernbox.cern.ch/s/gIyjbQbmoTFHCrr,
>>> > > if you want to have a look.
>>> > >
>>> > > Many thanks, Massimo
>>> > >
>>> > > [*] https://www.mail-archive.com/ceph-users@xxxxxxx/msg18534.html
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



