I got a similar problem after changing my pools to use only the hdd device
class, following https://www.spinics.net/lists/ceph-users/msg84987.html.
The data migrated successfully, but I get warnings like:

2024-12-23T14:39:37.103+0100 7f949edad640  0 [pg_autoscaler WARNING root] pool default.rgw.buckets.index won't scale due to overlapping roots: {-1, -18}
2024-12-23T14:39:37.105+0100 7f949edad640  0 [pg_autoscaler WARNING root] pool default.rgw.buckets.data won't scale due to overlapping roots: {-2, -1, -18}
2024-12-23T14:39:37.107+0100 7f949edad640  0 [pg_autoscaler WARNING root] pool cephfs_metadata won't scale due to overlapping roots: {-2, -1, -18}
2024-12-23T14:39:37.111+0100 7f949edad640  0 [pg_autoscaler WARNING root] pool 1 contains an overlapping root -1... skipping scaling
...

while the crush tree with shadow entries shows:

 -2   hdd  1043.93188  root default~hdd
 -4   hdd   151.82336      host ctplosd1~hdd
  0   hdd     5.45798          osd.0
  1   hdd     5.45798          osd.1
  2   hdd     5.45798          osd.2
  3   hdd     5.45798          osd.3
  4   hdd     5.45798          osd.4
...
 -1        1050.48230  root default
 -3         153.27872      host ctplosd1
  0   hdd     5.45798          osd.0
  1   hdd     5.45798          osd.1
  2   hdd     5.45798          osd.2
  3   hdd     5.45798          osd.3
  4   hdd     5.45798          osd.4
...

Even though the crush rule for, for example, pool 9:

pool 9 'default.rgw.buckets.data' erasure profile ec-32-profile size 5 min_size 4 crush_rule 1 object_hash rjenkins pg_num 512 pgp_num 512 autoscale_mode on last_change 320144 lfor 0/127784/214408 flags hashpspool,ec_overwrites stripe_width 12288 application rgw

is set to:

{
    "rule_id": 1,
    "rule_name": "ec32",
    "type": 3,
    "steps": [
        {
            "op": "set_chooseleaf_tries",
            "num": 5
        },
        {
            "op": "set_choose_tries",
            "num": 100
        },
        {
            "op": "take",
            "item": -2,
            "item_name": "default~hdd"
        },
        {
            "op": "chooseleaf_indep",
            "num": 0,
            "type": "host"
        },
        {
            "op": "emit"
        }
    ]
}

I still get the warning messages.

Is there a way I can check whether a particular "root" is used anywhere,
other than going through "ceph osd pool ls detail" and looking into each
crush rule? Can I somehow delete the "old" root default? Would it be safe
to change pg_num manually even with overlapping roots?

Rok

On Wed, Jan 25, 2023 at 12:03 PM Massimo Sgaravatto <massimo.sgaravatto@xxxxxxxxx> wrote:

> I tried the following on a small testbed first:
>
> ceph osd erasure-code-profile set profile-4-2-hdd k=4 m=2 crush-failure-domain=host crush-device-class=hdd
> ceph osd crush rule create-erasure ecrule-4-2-hdd profile-4-2-hdd
> ceph osd pool set ecpool-4-2 crush_rule ecrule-4-2-hdd
>
> and indeed, after having applied this change to all the EC pools, the
> autoscaler doesn't complain anymore.
>
> Thanks a lot!
>
> Cheers, Massimo
>
> On Tue, Jan 24, 2023 at 7:02 PM Eugen Block <eblock@xxxxxx> wrote:
>
> > Hi,
> >
> > what you can't change with EC pools is the EC profile; the pool's
> > ruleset you can change. The fix is the same as for the replicated
> > pools: assign a ruleset with the hdd class, and after some data
> > movement the autoscaler should not complain anymore.
> >
> > Regards
> > Eugen
> >
> > Quoting Massimo Sgaravatto <massimo.sgaravatto@xxxxxxxxx>:
> >
> > > Dear all
> > >
> > > I have just changed the crush rule for all the replicated pools in the
> > > following way:
> > >
> > > ceph osd crush rule create-replicated replicated_hdd default host hdd
> > > ceph osd pool set <poolname> crush_rule replicated_hdd
> > >
> > > See also this [*] thread.
> > > Before applying this change, these pools were all using
> > > the replicated_ruleset rule, where the class is not specified.
> > >
> > > I am noticing now a problem with the autoscaler: "ceph osd pool
> > > autoscale-status" doesn't report any output and the mgr log complains
> > > about overlapping roots:
> > >
> > > [pg_autoscaler ERROR root] pool xyz has overlapping roots: {-18, -1}
> > >
> > > Indeed:
> > >
> > > # ceph osd crush tree --show-shadow
> > > ID   CLASS  WEIGHT      TYPE NAME
> > > -18  hdd    1329.26501  root default~hdd
> > > -17  hdd     329.14154      rack Rack11-PianoAlto~hdd
> > > -15  hdd      54.56085          host ceph-osd-04~hdd
> > >  30  hdd       5.45609              osd.30
> > >  31  hdd       5.45609              osd.31
> > > ...
> > > ...
> > >  -1         1329.26501  root default
> > >  -7          329.14154      rack Rack11-PianoAlto
> > >  -8           54.56085          host ceph-osd-04
> > >  30  hdd       5.45609              osd.30
> > >  31  hdd       5.45609              osd.31
> > > ...
> > >
> > > I have already read about this behavior, but I have no clear idea how
> > > to fix the problem.
> > >
> > > I read somewhere that the problem happens when there are rules that
> > > force some pools to use only one class, and there are also pools which
> > > do not make any distinction between device classes.
> > >
> > > All the replicated pools are using the replicated_hdd rule, but I also
> > > have some EC pools which are using a profile where the class is not
> > > specified. As far as I understand, I can't force these pools to use
> > > only the hdd class: according to the doc I can't change this profile
> > > to specify the hdd class (or at least the change wouldn't be applied
> > > to the existing EC pools).
> > >
> > > Any suggestions?
> > >
> > > The crush map is available at https://cernbox.cern.ch/s/gIyjbQbmoTFHCrr,
> > > if you want to have a look.
> > >
> > > Many thanks, Massimo
> > >
> > > [*] https://www.mail-archive.com/ceph-users@xxxxxxx/msg18534.html
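For reference, below is a minimal sketch of one way to check which root each
CRUSH rule takes and which rule each pool uses, without walking through
"ceph osd pool ls detail" by eye. It assumes jq is available; the field
names rule_id, rule_name and item_name appear in the rule dump shown above,
while pool_name and crush_rule are assumed to match the JSON output of
"ceph osd pool ls detail" on a recent Ceph release, so adjust them if your
version differs. Treat it as a starting point rather than a definitive
procedure.

# Which root does each CRUSH rule "take"?
ceph osd crush rule dump | \
  jq -r '.[] | [.rule_id, .rule_name, (.steps[] | select(.op == "take") | .item_name)] | @tsv'

# Which rule does each pool use?
ceph osd pool ls detail -f json | \
  jq -r '.[] | [.pool_name, .crush_rule] | @tsv'

Cross-referencing the two outputs on the rule id shows which pools still
reference the plain "default" root (-1 above) rather than a shadow root
such as "default~hdd".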