What the command I pasted did was (after diffing the CRUSH map output):

- Move all the bucket IDs from the main tree into the parallel hdd-class shadow tree
- Assign fresh IDs to the main tree
- Change "step take default" to "step take default class hdd" in every CRUSH rule

Moving the IDs over is what did the trick to avoid data movement. Of course, that only works if you do it for every rule, since any rules left at "step take default" will now see a different tree with different IDs, and their data will move.
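For the archives, the end-to-end sequence looks roughly like this. Take it as a sketch rather than a recipe: the file names are placeholders, and --compare may not exist in older crushtool builds (check crushtool --help first):

    # Grab and decompile the current CRUSH map
    ceph osd getcrushmap -o crush.old
    crushtool -d crush.old -o crush.old.txt

    # Rewrite it: swap the bucket IDs into the hdd shadow tree and point
    # every "step take default" at the hdd class instead
    crushtool -i crush.old --reclassify --reclassify-root default hdd -o crush.new

    # Eyeball the diff, then check that the mappings stay the same
    crushtool -d crush.new -o crush.new.txt
    diff -u crush.old.txt crush.new.txt
    crushtool -i crush.old --compare crush.new

    # Inject the adjusted map
    ceph osd setcrushmap -i crush.new

And to make the ID swap concrete, here is roughly what a reclassified host and rule end up looking like in the decompiled map (host name, IDs, and weight are made up, not from my cluster):

    host node1 {
            id -9                # fresh id assigned to the main tree
            id -2 class hdd      # old main-tree id, moved into the hdd shadow tree
            alg straw2
            hash 0               # rjenkins1
            item osd.0 weight 1.000
    }

    rule replicated_rule {
            id 0
            type replicated
            min_size 1
            max_size 10
            step take default class hdd      # was: step take default
            step chooseleaf firstn 0 type host
            step emit
    }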
On 2025/03/15 12:00, Anthony D'Atri wrote:
>
> I haven't used reclassify in a while, but does the output look like the
> below, specifying the device class with item_name?
>
>     {
>         "rule_id": 3,
>         "rule_name": "tsECpool",
>         "type": 3,
>         "steps": [
>             {
>                 "op": "set_chooseleaf_tries",
>                 "num": 5
>             },
>             {
>                 "op": "set_choose_tries",
>                 "num": 100
>             },
>             {
>                 "op": "take",
>                 "item": -2,
>                 "item_name": "default~hdd"
>             },
>             {
>                 "op": "chooseleaf_indep",
>                 "num": 0,
>                 "type": "host"
>             },
>             {
>                 "op": "emit"
>             }
>         ]
>     },
>
>> Aha, that's what I was looking for! And indeed, it seems to do exactly
>> what I thought of doing, just move the bucket IDs. I mistakenly thought
>> this functionality was for people who already had multiple parallel
>> hierarchies in their CRUSH tree, but it also works for a single default
>> hierarchy.
>>
>> This did the trick:
>>
>>     crushtool -i crush.old --reclassify --reclassify-root default hdd -o crush
>>
>> Thanks!
>>
>> On 2025/03/14 23:17, Eugen Block wrote:
>>> The crushtool would do that with the --reclassify flag. There was a
>>> thread here on this list a couple of months ago. I'm on my mobile, so
>>> I don't have a link for you right now, but the docs should also
>>> contain some examples, if I'm not mistaken.
>>>
>>> Quoting Hector Martin <marcan@xxxxxxxxx>:
>>>
>>>> Hi,
>>>>
>>>> I have an old Mimic cluster that I'm doing some cleanup work on and
>>>> adding SSDs, before upgrading to a newer version.
>>>>
>>>> As part of adding SSDs, I first need to switch the existing CRUSH
>>>> rules to use only the HDD device class. Is there some way of doing
>>>> this that doesn't result in 100% data movement?
>>>>
>>>> Simply replacing `step take default` with `step take default class
>>>> hdd` in every CRUSH rule seems to completely shuffle the cluster
>>>> data. I tried manually specifying the bucket IDs for the hdd
>>>> hierarchy so that they would at least be in the same order as the
>>>> bucket IDs for the primary hierarchy, hoping that having them sort
>>>> the same would end up with the same data distribution, but that
>>>> didn't work either.
>>>>
>>>> Is there some magic incantation to swap around the CRUSH rules/tree
>>>> so that it results in exactly the same data distribution after
>>>> adding the hdd class constraint? The set of potential OSDs should be
>>>> identical (there are no SSDs yet), so the data movement seems to be
>>>> some technicality of the CRUSH implementation... perhaps completely
>>>> switching around the main id and hdd-class id of all the buckets
>>>> would do it? (I'm a little afraid to mess with the main ids in a
>>>> production cluster...)
>>>>
>>>> This cluster is already having I/O load issues (that's part of why
>>>> I'm adding SSDs), so I'd really like to avoid a total data shuffle
>>>> if possible.
>>>>
>>>> Thanks,
>>>> - Hector
>>
>> - Hector

- Hector
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx