If your NVMe OSDs have the `ssd` device class, doing what you suggest might not even result in any data movement.

https://docs.ceph.com/en/reef/rados/operations/crush-map-edits/#migrating-from-a-legacy-ssd-rule-to-device-classes

That page shows how to use the reclassify feature to help avoid typos when editing the CRUSH map. Using a CLI tool when feasible makes this sort of thing a lot safer, compared to back in the day when we had to text-edit everything by hand :nailbiting:. One can readily diff the before-and-after decompiled text CRUSH maps to ensure sanity before recompiling and injecting. I've done this myself multiple times since device classes became a thing. A rough sketch of that round trip is at the bottom of this mail, below the quoted thread.

> On Dec 23, 2024, at 5:05 PM, Rok Jaklič <rjaklic@xxxxxxxxx> wrote:
>
> I will try changing/adding class ssd to replicated_rule tomorrow, even
> though, for some reason, I am a little hesitant to edit this rule, since it
> could mean that system data for rgw would "stay somewhere" if something
> goes wrong. I was much braver when I was changing the rule for EC32, where
> I separated OSD data to just hdd, since "some data" was already on hdd.
>
> On Mon, Dec 23, 2024 at 4:12 PM Anthony D'Atri <anthony.datri@xxxxxxxxx> wrote:
>
>> Agreed. The .mgr pool is a usual suspect here, especially when using
>> Rook. When any pool is constrained to a device class, this kind of warning
>> will happen if *all* pools don't specify one.
>>
>> Of course there's also the strategy of disabling the autoscaler, but that
>> takes more analysis. We old farts are used to it, but it can be daunting
>> for whippersnappers.
>>
>>> On Dec 23, 2024, at 9:11 AM, Eugen Block <eblock@xxxxxx> wrote:
>>>
>>> Don't try to delete a root, that will definitely break something.
>>> Instead, check the crush rules which don't use a device class and use the
>>> reclassify feature of crushtool to modify the rules. This will trigger
>>> only a bit of data movement, but not as much as a simple change of the
>>> rule would.
>>>
>>> Zitat von Rok Jaklič <rjaklic@xxxxxxxxx>:
>>>
>>>> I got a similar problem after changing a pool to use only the hdd class,
>>>> following https://www.spinics.net/lists/ceph-users/msg84987.html. Data
>>>> migrated successfully.
>>>>
>>>> I get warnings like:
>>>>
>>>> 2024-12-23T14:39:37.103+0100 7f949edad640 0 [pg_autoscaler WARNING root] pool default.rgw.buckets.index won't scale due to overlapping roots: {-1, -18}
>>>> 2024-12-23T14:39:37.105+0100 7f949edad640 0 [pg_autoscaler WARNING root] pool default.rgw.buckets.data won't scale due to overlapping roots: {-2, -1, -18}
>>>> 2024-12-23T14:39:37.107+0100 7f949edad640 0 [pg_autoscaler WARNING root] pool cephfs_metadata won't scale due to overlapping roots: {-2, -1, -18}
>>>> 2024-12-23T14:39:37.111+0100 7f949edad640 0 [pg_autoscaler WARNING root] pool 1 contains an overlapping root -1... skipping scaling
>>>> ...
>>>>
>>>> while the crush tree with shadow shows:
>>>>
>>>> -2    hdd  1043.93188  root default~hdd
>>>> -4    hdd   151.82336      host ctplosd1~hdd
>>>>  0    hdd     5.45798          osd.0
>>>>  1    hdd     5.45798          osd.1
>>>>  2    hdd     5.45798          osd.2
>>>>  3    hdd     5.45798          osd.3
>>>>  4    hdd     5.45798          osd.4
>>>> ...
>>>> -1         1050.48230  root default
>>>> -3          153.27872      host ctplosd1
>>>>  0    hdd     5.45798          osd.0
>>>>  1    hdd     5.45798          osd.1
>>>>  2    hdd     5.45798          osd.2
>>>>  3    hdd     5.45798          osd.3
>>>>  4    hdd     5.45798          osd.4
>>>> ...
>>>>
>>>> and even though the crush rule for pool 9, for example:
>>>>
>>>> pool 9 'default.rgw.buckets.data' erasure profile ec-32-profile size 5
>>>> min_size 4 crush_rule 1 object_hash rjenkins pg_num 512 pgp_num 512
>>>> autoscale_mode on last_change 320144 lfor 0/127784/214408 flags
>>>> hashpspool,ec_overwrites stripe_width 12288 application rgw
>>>>
>>>> is set to:
>>>>
>>>> {
>>>>     "rule_id": 1,
>>>>     "rule_name": "ec32",
>>>>     "type": 3,
>>>>     "steps": [
>>>>         {
>>>>             "op": "set_chooseleaf_tries",
>>>>             "num": 5
>>>>         },
>>>>         {
>>>>             "op": "set_choose_tries",
>>>>             "num": 100
>>>>         },
>>>>         {
>>>>             "op": "take",
>>>>             "item": -2,
>>>>             "item_name": "default~hdd"
>>>>         },
>>>>         {
>>>>             "op": "chooseleaf_indep",
>>>>             "num": 0,
>>>>             "type": "host"
>>>>         },
>>>>         {
>>>>             "op": "emit"
>>>>         }
>>>>     ]
>>>> },
>>>>
>>>> I still get warning messages.
>>>>
>>>> Is there a way I can check whether a particular "root" is used somewhere,
>>>> other than going through ceph osd pool ls detail and looking at each
>>>> crush rule?
>>>>
>>>> Can I somehow delete the "old" root default?
>>>>
>>>> Would it be safe to change pg_num manually even with overlapping roots?
>>>>
>>>> Rok
>>>>
>>>> On Wed, Jan 25, 2023 at 12:03 PM Massimo Sgaravatto
>>>> <massimo.sgaravatto@xxxxxxxxx> wrote:
>>>>
>>>>> I tried the following on a small testbed first:
>>>>>
>>>>> ceph osd erasure-code-profile set profile-4-2-hdd k=4 m=2 crush-failure-domain=host crush-device-class=hdd
>>>>> ceph osd crush rule create-erasure ecrule-4-2-hdd profile-4-2-hdd
>>>>> ceph osd pool set ecpool-4-2 crush_rule ecrule-4-2-hdd
>>>>>
>>>>> and indeed, after having applied this change to all the EC pools, the
>>>>> autoscaler doesn't complain anymore.
>>>>>
>>>>> Thanks a lot!
>>>>>
>>>>> Cheers, Massimo
>>>>>
>>>>> On Tue, Jan 24, 2023 at 7:02 PM Eugen Block <eblock@xxxxxx> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> what you can't change with EC pools is the EC profile; the pool's
>>>>>> ruleset you can change. The fix is the same as for the replicated
>>>>>> pools: assign a ruleset with the hdd class, and after some data
>>>>>> movement the autoscaler should not complain anymore.
>>>>>>
>>>>>> Regards
>>>>>> Eugen
>>>>>>
>>>>>> Zitat von Massimo Sgaravatto <massimo.sgaravatto@xxxxxxxxx>:
>>>>>>
>>>>>>> Dear all
>>>>>>>
>>>>>>> I have just changed the crush rule for all the replicated pools in the
>>>>>>> following way:
>>>>>>>
>>>>>>> ceph osd crush rule create-replicated replicated_hdd default host hdd
>>>>>>> ceph osd pool set <poolname> crush_rule replicated_hdd
>>>>>>>
>>>>>>> See also this [*] thread.
>>>>>>> Before applying this change, these pools were all using the
>>>>>>> replicated_ruleset rule, where the class is not specified.
>>>>>>>
>>>>>>> I am noticing now a problem with the autoscaler: "ceph osd pool
>>>>>>> autoscale-status" doesn't report any output, and the mgr log complains
>>>>>>> about overlapping roots:
>>>>>>>
>>>>>>> [pg_autoscaler ERROR root] pool xyz has overlapping roots: {-18, -1}
>>>>>>>
>>>>>>> Indeed:
>>>>>>>
>>>>>>> # ceph osd crush tree --show-shadow
>>>>>>> ID   CLASS  WEIGHT      TYPE NAME
>>>>>>> -18  hdd    1329.26501  root default~hdd
>>>>>>> -17  hdd     329.14154      rack Rack11-PianoAlto~hdd
>>>>>>> -15  hdd      54.56085          host ceph-osd-04~hdd
>>>>>>>  30  hdd       5.45609              osd.30
>>>>>>>  31  hdd       5.45609              osd.31
>>>>>>> ...
>>>>>>> ...
>>>>>>> -1          1329.26501  root default
>>>>>>> -7           329.14154      rack Rack11-PianoAlto
>>>>>>> -8            54.56085          host ceph-osd-04
>>>>>>>  30  hdd       5.45609              osd.30
>>>>>>>  31  hdd       5.45609              osd.31
>>>>>>> ...
>>>>>>>
>>>>>>> I have already read about this behavior, but I have no clear idea how
>>>>>>> to fix the problem.
>>>>>>>
>>>>>>> I read somewhere that the problem happens when there are rules that
>>>>>>> force some pools to use only one class, and there are also pools which
>>>>>>> do not make any distinction between device classes.
>>>>>>>
>>>>>>> All the replicated pools are using the replicated_hdd rule, but I also
>>>>>>> have some EC pools which are using a profile where the class is not
>>>>>>> specified. As far as I understand, I can't force these pools to use
>>>>>>> only the hdd class: according to the doc I can't change this profile to
>>>>>>> specify the hdd class (or at least the change wouldn't be applied to
>>>>>>> the existing EC pools).
>>>>>>>
>>>>>>> Any suggestions?
>>>>>>>
>>>>>>> The crush map is available at https://cernbox.cern.ch/s/gIyjbQbmoTFHCrr,
>>>>>>> if you want to have a look.
>>>>>>>
>>>>>>> Many thanks, Massimo
>>>>>>>
>>>>>>> [*] https://www.mail-archive.com/ceph-users@xxxxxxx/msg18534.html
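Here, for reference, is roughly what that reclassify round trip looks like. This is a sketch from memory and untested: the file names are arbitrary, and "--reclassify-root default hdd" is only a placeholder — which reclassify arguments (--reclassify-root, --reclassify-bucket, --set-subtree-class) are right for a given tree is covered in the doc linked above.

    # Grab the current CRUSH map and keep a decompiled "before" copy to diff against
    ceph osd getcrushmap -o crush.orig.bin
    crushtool -d crush.orig.bin -o crush.orig.txt

    # Let crushtool rewrite the legacy rules to use device classes instead of hand-editing
    crushtool -i crush.orig.bin --reclassify \
        --reclassify-root default hdd \
        -o crush.new.bin
    crushtool -d crush.new.bin -o crush.new.txt

    # Sanity checks before injecting anything
    diff -u crush.orig.txt crush.new.txt
    crushtool -i crush.orig.bin --compare crush.new.bin

    # Only once the diff looks right and --compare reports no (or acceptably few) mismatched mappings
    ceph osd setcrushmap -i crush.new.bin

As for the question above about finding which rules still take the bare root without trawling through `ceph osd pool ls detail`: the rule dump shows which root each rule takes (a ~class suffix means a shadow root), and the pool-to-rule mapping is one `ceph osd pool get` away. Again untested, but something like:

    # Which root does each rule take?
    ceph osd crush rule dump | grep -E '"rule_name"|"item_name"'

    # Which rule does each pool use?
    for p in $(ceph osd pool ls); do echo -n "$p: "; ceph osd pool get "$p" crush_rule; done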