Oddly, the Nautilus cluster that I'm gradually decommissioning seems to
have the same shadow-root pattern in its crush map. I don't know if that
really means anything, but at least I know it's not something I did
differently when I set up the new Reef cluster.

-Dave

--
Dave Hall
Binghamton University
kdhall@xxxxxxxxxxxxxx
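For the rule change discussed in the quoted thread below (pointing rule
id 0 at 'step take default class hdd'), a cautious offline dry run might
look something like the following sketch. The filenames and the replica
count of 3 are assumptions, not taken from the thread:

    ceph osd getcrushmap -o current.crush
    crushtool -d current.crush -o current.txt
    # edit current.txt so that rule id 0 reads:
    #   step take default class hdd
    crushtool -c current.txt -o proposed.crush
    # map sample inputs through rule 0 without touching the live cluster
    crushtool -i proposed.crush --test --rule 0 --num-rep 3 --show-mappings

If the test mappings look sane (each input maps to the expected number of
OSDs on distinct hosts), the compiled map can then be injected with
'ceph osd setcrushmap', as in Anthony's sequence quoted below.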
On Fri, Sep 20, 2024 at 12:48 PM Dave Hall <kdhall@xxxxxxxxxxxxxx> wrote:

> Stefan, Anthony,
>
> Anthony's sequence of commands to reclassify the root failed with
> errors, so I have tried to look a little deeper.
>
> I can now see the extra root via 'ceph osd crush tree --show-shadow'.
> Looking at the decompiled crush map, I can also see the extra root:
>
> root default {
>         id -1             # do not change unnecessarily
>         *id -2 class hdd  # do not change unnecessarily*
>         # weight 361.90518
>         alg straw2
>         hash 0  # rjenkins1
>         item ceph00 weight 90.51434
>         item ceph01 weight 90.29265
>         item ceph09 weight 90.80554
>         item ceph02 weight 90.29265
> }
>
> Based on the hints given in the link provided by Stefan, it would appear
> that the correct solution might be to get rid of 'id -2' and change
> 'id -1' to class hdd:
>
> root default {
>         *id -1 class hdd  # do not change unnecessarily*
>         # weight 361.90518
>         alg straw2
>         hash 0  # rjenkins1
>         item ceph00 weight 90.51434
>         item ceph01 weight 90.29265
>         item ceph09 weight 90.80554
>         item ceph02 weight 90.29265
> }
>
> but I'm no expert and I'm anxious about losing data.
>
> The rest of the rules in my crush map are:
>
> # rules
> rule replicated_rule {
>         id 0
>         type replicated
>         step take default
>         step chooseleaf firstn 0 type host
>         step emit
> }
> rule block-1 {
>         id 1
>         type erasure
>         step set_chooseleaf_tries 5
>         step set_choose_tries 100
>         step take default class hdd
>         step choose indep 0 type osd
>         step emit
> }
> rule default.rgw.buckets.data {
>         id 2
>         type erasure
>         step set_chooseleaf_tries 5
>         step set_choose_tries 100
>         step take default class hdd
>         step choose indep 0 type osd
>         step emit
> }
> rule ceph-block {
>         id 3
>         type erasure
>         step set_chooseleaf_tries 5
>         step set_choose_tries 100
>         step take default class hdd
>         step choose indep 0 type osd
>         step emit
> }
> rule replicated-hdd {
>         id 4
>         type replicated
>         step take default class hdd
>         step choose firstn 0 type osd
>         step emit
> }
>
> # end crush map
>
> Of these, the last - id 4 - is one that I added while trying to figure
> this out. What this tells me is that the 'take' step in rule id 0 should
> probably change to 'step take default class hdd'.
>
> I also notice that each of my host stanzas (buckets) has what looks like
> two roots. For example:
>
> host ceph00 {
>         id -3            # do not change unnecessarily
>         id -4 class hdd  # do not change unnecessarily
>         # weight 90.51434
>         alg straw2
>         hash 0  # rjenkins1
>         item osd.0 weight 11.35069
>         item osd.1 weight 11.35069
>         item osd.2 weight 11.35069
>         item osd.3 weight 11.35069
>         item osd.4 weight 11.27789
>         item osd.5 weight 11.27789
>         item osd.6 weight 11.27789
>         item osd.7 weight 11.27789
> }
>
> I assume I may need to clean this up somehow, or perhaps this is the
> real problem.
>
> Please advise.
>
> Thanks.
>
> -Dave
>
> --
> Dave Hall
> Binghamton University
> kdhall@xxxxxxxxxxxxxx
>
> On Thu, Sep 19, 2024 at 3:56 AM Stefan Kooman <stefan@xxxxxx> wrote:
>
>> On 19-09-2024 05:10, Anthony D'Atri wrote:
>> >
>> >> Anthony,
>> >>
>> >> So it sounds like I need to make a new crush rule for replicated
>> >> pools that specifies default-hdd and the device class? (Or should I
>> >> go the other way around? I think I'd rather change the replicated
>> >> pools even though there's more of them.)
>> >
>> > I think it would be best to edit the CRUSH rules in-situ so that each
>> > specifies the device class; that way, if you do get different media
>> > in the future, you'll be ready. Rather than messing around with new
>> > rules and modifying pools, this is arguably one of the few times when
>> > one would decompile, edit, recompile, and inject the CRUSH map in
>> > toto.
>> >
>> > I haven't tried this myself, but maybe something like the below, to
>> > avoid the PITA and potential for error of editing the decompiled
>> > text file by hand.
>> >
>> > ceph osd getcrushmap -o original.crush
>> > crushtool -d original.crush -o original.txt
>> > crushtool -i original.crush --reclassify --reclassify-root default hdd --set-subtree-class default hdd -o adjusted.crush
>> > crushtool -d adjusted.crush -o adjusted.txt
>> > crushtool -i original.crush --compare adjusted.crush
>> > ceph osd setcrushmap -i adjusted.crush
>>
>> This might be of use as well (if a lot of data would move):
>> https://blog.widodh.nl/2019/02/comparing-two-ceph-crush-maps/
>>
>> Gr. Stefan
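As a follow-up to Stefan's point about data movement: one way to estimate
how many PGs would be remapped before running 'ceph osd setcrushmap' is
to replay the PG mappings against an offline copy of the osdmap. This is
only a sketch, not the method from the linked post; 'adjusted.crush'
refers to the output of Anthony's crushtool run above, and the pool id of
1 is a placeholder:

    ceph osd getmap -o osdmap.bin
    osdmaptool osdmap.bin --test-map-pgs --pool 1 > before.txt
    # import the candidate crush map (this rewrites osdmap.bin in place)
    osdmaptool osdmap.bin --import-crush adjusted.crush
    osdmaptool osdmap.bin --test-map-pgs --pool 1 > after.txt
    diff before.txt after.txt

A large diff here means a correspondingly large rebalance once the new
map is injected into the live cluster.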