I got a similar problem after changing my pools to use only the hdd device
class, following https://www.spinics.net/lists/ceph-users/msg84987.html.
The data migrated successfully, but I get warnings like:

2024-12-23T14:39:37.103+0100 7f949edad640  0 [pg_autoscaler WARNING root] pool default.rgw.buckets.index won't scale due to overlapping roots: {-1, -18}
2024-12-23T14:39:37.105+0100 7f949edad640  0 [pg_autoscaler WARNING root] pool default.rgw.buckets.data won't scale due to overlapping roots: {-2, -1, -18}
2024-12-23T14:39:37.107+0100 7f949edad640  0 [pg_autoscaler WARNING root] pool cephfs_metadata won't scale due to overlapping roots: {-2, -1, -18}
2024-12-23T14:39:37.111+0100 7f949edad640  0 [pg_autoscaler WARNING root] pool 1 contains an overlapping root -1... skipping scaling
...

while the crush tree with shadow entries shows:

 -2   hdd  1043.93188  root default~hdd
 -4   hdd   151.82336      host ctplosd1~hdd
  0   hdd     5.45798          osd.0
  1   hdd     5.45798          osd.1
  2   hdd     5.45798          osd.2
  3   hdd     5.45798          osd.3
  4   hdd     5.45798          osd.4
...
 -1        1050.48230  root default
 -3         153.27872      host ctplosd1
  0   hdd     5.45798          osd.0
  1   hdd     5.45798          osd.1
  2   hdd     5.45798          osd.2
  3   hdd     5.45798          osd.3
  4   hdd     5.45798          osd.4
...

Even though the crush rule for, for example, pool 9:

pool 9 'default.rgw.buckets.data' erasure profile ec-32-profile size 5 min_size 4 crush_rule 1 object_hash rjenkins pg_num 512 pgp_num 512 autoscale_mode on last_change 320144 lfor 0/127784/214408 flags hashpspool,ec_overwrites stripe_width 12288 application rgw

is set to:

{
    "rule_id": 1,
    "rule_name": "ec32",
    "type": 3,
    "steps": [
        {
            "op": "set_chooseleaf_tries",
            "num": 5
        },
        {
            "op": "set_choose_tries",
            "num": 100
        },
        {
            "op": "take",
            "item": -2,
            "item_name": "default~hdd"
        },
        {
            "op": "chooseleaf_indep",
            "num": 0,
            "type": "host"
        },
        {
            "op": "emit"
        }
    ]
}

I still get the warning messages.

Is there a way I can check whether a particular "root" is used anywhere,
other than going through "ceph osd pool ls detail" and looking into each
crush rule? Can I somehow delete the "old" root default? Would it be safe
to change pg_num manually even with overlapping roots?

Rok

On Wed, Jan 25, 2023 at 12:03 PM Massimo Sgaravatto <massimo.sgaravatto@xxxxxxxxx> wrote:

> I tried the following on a small testbed first:
>
> ceph osd erasure-code-profile set profile-4-2-hdd k=4 m=2 crush-failure-domain=host crush-device-class=hdd
> ceph osd crush rule create-erasure ecrule-4-2-hdd profile-4-2-hdd
> ceph osd pool set ecpool-4-2 crush_rule ecrule-4-2-hdd
>
> and indeed, after having applied this change to all the EC pools, the
> autoscaler doesn't complain anymore.
>
> Thanks a lot!
>
> Cheers, Massimo
>
> On Tue, Jan 24, 2023 at 7:02 PM Eugen Block <eblock@xxxxxx> wrote:
>
> > Hi,
> >
> > what you can't change with EC pools is the EC profile; the pool's
> > ruleset you can change. The fix is the same as for the replicated
> > pools: assign a ruleset with the hdd class, and after some data
> > movement the autoscaler should not complain anymore.
> >
> > Regards
> > Eugen
> >
> > Quoting Massimo Sgaravatto <massimo.sgaravatto@xxxxxxxxx>:
> >
> > > Dear all
> > >
> > > I have just changed the crush rule for all the replicated pools in the
> > > following way:
> > >
> > > ceph osd crush rule create-replicated replicated_hdd default host hdd
> > > ceph osd pool set <poolname> crush_rule replicated_hdd
> > >
> > > See also this [*] thread.
> > > Before applying this change, these pools were all using
> > > the replicated_ruleset rule, where the class is not specified.
> > >
> > > I am noticing now a problem with the autoscaler: "ceph osd pool
> > > autoscale-status" doesn't report any output and the mgr log complains
> > > about overlapping roots:
> > >
> > > [pg_autoscaler ERROR root] pool xyz has overlapping roots: {-18, -1}
> > >
> > > Indeed:
> > >
> > > # ceph osd crush tree --show-shadow
> > > ID   CLASS  WEIGHT      TYPE NAME
> > > -18  hdd    1329.26501  root default~hdd
> > > -17  hdd     329.14154      rack Rack11-PianoAlto~hdd
> > > -15  hdd      54.56085          host ceph-osd-04~hdd
> > >  30  hdd       5.45609              osd.30
> > >  31  hdd       5.45609              osd.31
> > > ...
> > > ...
> > >  -1         1329.26501  root default
> > >  -7          329.14154      rack Rack11-PianoAlto
> > >  -8           54.56085          host ceph-osd-04
> > >  30  hdd       5.45609              osd.30
> > >  31  hdd       5.45609              osd.31
> > > ...
> > >
> > > I have already read about this behavior, but I have no clear idea how
> > > to fix the problem.
> > >
> > > I read somewhere that the problem happens when there are rules that
> > > force some pools to use only one class, and there are also pools which
> > > do not make any distinction between device classes.
> > >
> > > All the replicated pools are using the replicated_hdd rule, but I also
> > > have some EC pools which are using a profile where the class is not
> > > specified. As far as I understand, I can't force these pools to use
> > > only the hdd class: according to the doc I can't change this profile
> > > to specify the hdd class (or at least the change wouldn't be applied
> > > to the existing EC pools).
> > >
> > > Any suggestions?
> > >
> > > The crush map is available at https://cernbox.cern.ch/s/gIyjbQbmoTFHCrr,
> > > if you want to have a look.
> > >
> > > Many thanks, Massimo
> > >
> > > [*] https://www.mail-archive.com/ceph-users@xxxxxxx/msg18534.html
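For reference, below is a minimal sketch of one way to check which root each
CRUSH rule takes and which rule each pool uses, without walking through
"ceph osd pool ls detail" by eye. It assumes jq is available; the field
names rule_id, rule_name and item_name appear in the rule dump shown above,
while pool_name and crush_rule are assumed to match the JSON output of
"ceph osd pool ls detail" on a recent Ceph release, so adjust them if your
version differs. Treat it as a starting point rather than a definitive
procedure.

# Which root does each CRUSH rule "take"?
ceph osd crush rule dump | \
  jq -r '.[] | [.rule_id, .rule_name, (.steps[] | select(.op == "take") | .item_name)] | @tsv'

# Which rule does each pool use?
ceph osd pool ls detail -f json | \
  jq -r '.[] | [.pool_name, .crush_rule] | @tsv'

Cross-referencing the two outputs on the rule id shows which pools still
reference the plain "default" root (-1 above) rather than a shadow root
such as "default~hdd".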