Re: cephadm crush_device_class not applied

I'm not sure why, but that's going to break a lot of people's Pacific specifications when they upgrade. We rely heavily on this functionality and use different device class names for a lot of good reasons. This seems like a regression to me.

David

On Thu, Oct 3, 2024, at 16:20, Eugen Block wrote:
> I think this PR [1] is responsible. And here are the three supported  
> classes [2]:
>
> class to_ceph_volume(object):
>
>      _supported_device_classes = [
>          "hdd", "ssd", "nvme"
>      ]
>
> Why this limitation?
>
> [1] https://github.com/ceph/ceph/pull/49555
> [2]  
> https://github.com/ceph/ceph/blob/v18.2.2/src/python-common/ceph/deployment/translate.py#L14
>
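> Until that is clarified, a possible workaround (just a sketch, untested
> here) is to reassign the class after the OSDs have been deployed:
>
> ceph osd crush rm-device-class osd.1
> ceph osd crush set-device-class hdd-ec osd.1
>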
> Zitat von Eugen Block <eblock@xxxxxx>:
>
>> It works as expected in Pacific 16.2.15 (or at least the way I expect
>> it to work). I applied the same spec file and now have my custom device
>> classes (the "test" class was the result of a manual daemon add
>> command):
>>
>> soc9-ceph:~ # ceph osd tree
>> ID  CLASS   WEIGHT   TYPE NAME           STATUS  REWEIGHT  PRI-AFF
>> -1          0.05878  root default
>> -3          0.05878      host soc9-ceph
>>  1  hdd-ec  0.00980          osd.1           up   1.00000  1.00000
>>  2  hdd-ec  0.00980          osd.2           up   1.00000  1.00000
>>  3  hdd-ec  0.00980          osd.3           up   1.00000  1.00000
>>  4  hdd-ec  0.00980          osd.4           up   1.00000  1.00000
>>  5  hdd-ec  0.00980          osd.5           up   1.00000  1.00000
>>  0    test  0.00980          osd.0           up   1.00000  1.00000
>>
>> So apparently something changed between Pacific and Quincy. To me this
>> looks like a regression, or is it even a bug? I'd appreciate any comments.
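>>
>> For completeness, the custom classes should also show up as regular
>> CRUSH device classes, which can be checked with (here it should list
>> hdd-ec and test):
>>
>> soc9-ceph:~ # ceph osd crush class ls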
>>
>> Zitat von Eugen Block <eblock@xxxxxx>:
>>
>>> Apparently, I can only use the "well-known" device classes in the
>>> specs, i.e. nvme, ssd or hdd. Any other string (even one without
>>> hyphens etc.) doesn't work.
>>>
>>> Zitat von Eugen Block <eblock@xxxxxx>:
>>>
>>>> Reading the docs again, I noticed that the "paths" keyword apparently
>>>> has to be used together with crush_device_class (why?), but that
>>>> doesn't work either. I tried specifying the class both globally in the
>>>> spec file and per device; still no change, the OSDs come up as "hdd".
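>>>>
>>>> For reference, this is roughly what the per-device variant looked
>>>> like (a sketch; whether the paths entries accept a crush_device_class
>>>> key at all may depend on the release):
>>>>
>>>> service_type: osd
>>>> service_id: hdd-ec
>>>> crush_device_class: hdd-ec
>>>> placement:
>>>>   label: osd
>>>> spec:
>>>>   data_devices:
>>>>     paths:
>>>>       - path: /dev/vdg
>>>>         crush_device_class: hdd-ec
>>>>   objectstore: bluestore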
>>>>
>>>> Zitat von Eugen Block <eblock@xxxxxx>:
>>>>
>>>>> Hi,
>>>>>
>>>>> I'm struggling to create OSDs with a dedicated crush_device_class.
>>>>> It sometimes worked when creating a new OSD via the command line
>>>>> (ceph orch daemon add osd
>>>>> host:data_devices=/dev/vdg,crush_device_class=test-hdd), but most
>>>>> of the time it doesn't. I also tried it with a spec file; it seems
>>>>> to be parsed correctly, but the new OSDs are created with the "hdd"
>>>>> class, not "hdd-ec". I have this spec:
>>>>>
>>>>> cat osd-class.yaml
>>>>> service_type: osd
>>>>> service_id: hdd-ec
>>>>> service_name: hdd-ec
>>>>> crush_device_class: hdd-ec
>>>>> placement:
>>>>>   label: osd
>>>>> spec:
>>>>>   data_devices:
>>>>>     rotational: 1
>>>>>     size: 10G
>>>>>   objectstore: bluestore
>>>>>
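>>>>> For reference, the spec file is applied with ceph orch apply (a dry
>>>>> run can be used first to preview the resulting OSDs):
>>>>>
>>>>> ceph orch apply -i osd-class.yaml --dry-run
>>>>> ceph orch apply -i osd-class.yaml
>>>>>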
>>>>> I see that cephadm has stored it correctly:
>>>>>
>>>>> ceph config-key get mgr/cephadm/spec.osd.hdd-ec
>>>>> {"created": "2024-10-03T08:35:41.364216Z", "needs_configuration":  
>>>>> true, "spec": {"placement": {"label": "osd"}, "service_id":  
>>>>> "hdd-ec", "service_name": "osd.hdd-ec", "service_type": "osd",  
>>>>> "spec": {"crush_device_class": "hdd-ec", "data_devices":  
>>>>> {"rotational": 1, "size": "10G"}, "filter_logic": "AND",  
>>>>> "objectstore": "bluestore"}}}
>>>>>
>>>>> And it has the OSDSPEC_AFFINITY set:
>>>>>
>>>>> cephadm ['--env', 'CEPH_VOLUME_OSDSPEC_AFFINITY=hdd-ec',  
>>>>> '--image',  
>>>>> 'registry.domain/ceph@sha256:ca901f9ff84d77f8734afad20556775f0ebaea6c62af8cca733161f5338d3f6c', '--timeout', '895', 'ceph-volume', '--fsid', '7d60533e-7e9e-11ef-b140-fa163e2ad8c5', '--config-json', '-', '--', 'lvm', 'batch', '--no-auto', '/dev/vdb', '/dev/vdc', '/dev/vdd', '/dev/vdf', '/dev/vdg', '/dev/vdh', '--yes',  
>>>>> '--no-systemd']
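>>>>>
>>>>> Note that the generated call above doesn't contain any
>>>>> --crush-device-class argument. When invoking ceph-volume by hand, the
>>>>> class would be passed explicitly, roughly like this (just a sketch,
>>>>> not what cephadm generated here):
>>>>>
>>>>> ceph-volume lvm batch --no-auto /dev/vdb --crush-device-class hdd-ec --yes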
>>>>>
>>>>> But the OSDs still are created with hdd device class:
>>>>>
>>>>> ceph osd tree
>>>>> ID  CLASS  WEIGHT   TYPE NAME           STATUS  REWEIGHT  PRI-AFF
>>>>> -1         0.05878  root default
>>>>> -3         0.05878      host soc9-ceph
>>>>> 0    hdd  0.00980          osd.0           up   1.00000  1.00000
>>>>> 1    hdd  0.00980          osd.1           up   1.00000  1.00000
>>>>> 2    hdd  0.00980          osd.2           up   1.00000  1.00000
>>>>> 3    hdd  0.00980          osd.3           up   1.00000  1.00000
>>>>> 4    hdd  0.00980          osd.4           up   1.00000  1.00000
>>>>> 5    hdd  0.00980          osd.5           up   1.00000  1.00000
>>>>>
>>>>> I have tried it with two different indentations:
>>>>>
>>>>> spec:
>>>>>   crush_device_class: hdd-ec
>>>>>
>>>>> and as seen above:
>>>>>
>>>>> crush_device_class: hdd-ec
>>>>> placement:
>>>>>   label: osd
>>>>> spec:
>>>>>
>>>>> According to the docs [0], crush_device_class is not supposed to be
>>>>> indented (i.e. not nested under spec), so my current spec seems
>>>>> valid. But in the mgr log with debug_mgr 10 I can see that after
>>>>> parsing it ends up nested under spec anyway:
>>>>>
>>>>> 2024-10-03T09:59:23.029+0000 7efef1cc6700  0 [cephadm DEBUG  
>>>>> cephadm.services.osd] Translating DriveGroup  
>>>>> <DriveGroupSpec.from_json(yaml.safe_load('''service_type: osd
>>>>> service_id: hdd-ec
>>>>> service_name: osd.hdd-ec
>>>>> placement:
>>>>>   label: osd
>>>>> spec:
>>>>>   crush_device_class: hdd-ec
>>>>>   data_devices:
>>>>>     rotational: 1
>>>>>     size: 10G
>>>>>   filter_logic: AND
>>>>>   objectstore: bluestore
>>>>> '''))> to ceph-volume command
>>>>>
>>>>> Now I'm wondering how it's actually supposed to work. Yesterday we
>>>>> saw the same behaviour on a customer cluster with Quincy 17.2.7 as
>>>>> well; this one here is Reef 18.2.2.
>>>>>
>>>>> Trying to create it manually also doesn't work as expected:
>>>>>
>>>>> soc9-ceph:~ # ceph orch daemon add osd  
>>>>> soc9-ceph:data_devices=/dev/vdf,crush_device_class=hdd-ec
>>>>> Created osd(s) 3 on host 'soc9-ceph'
>>>>>
>>>>> soc9-ceph:~ # ceph osd tree | grep osd.3
>>>>> 3    hdd  0.00980          osd.3           up   1.00000  1.00000
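>>>>>
>>>>> The exact ceph-volume call for that OSD can also be checked in the
>>>>> ceph-volume log on the host (path assuming the default cephadm log
>>>>> location):
>>>>>
>>>>> grep crush-device-class /var/log/ceph/7d60533e-7e9e-11ef-b140-fa163e2ad8c5/ceph-volume.log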
>>>>>
>>>>> This is in the mgr debug output from the manual creation:
>>>>>
>>>>> 2024-10-03T10:06:00.329+0000 7efeeecc0700  0 [orchestrator DEBUG  
>>>>> root] _oremote orchestrator ->  
>>>>> cephadm.create_osds(*(DriveGroupSpec.from_json(yaml.safe_load('''service_type: osd
>>>>> service_name: osd
>>>>> placement:
>>>>>   host_pattern: soc9-ceph
>>>>> spec:
>>>>>   crush_device_class: hdd-ec
>>>>>   data_devices:
>>>>>     paths:
>>>>>     - /dev/vdf
>>>>>   filter_logic: AND
>>>>>   objectstore: bluestore
>>>>> ''')),), **{})
>>>>> 2024-10-03T10:06:00.333+0000 7efeeecc0700  0 [cephadm DEBUG  
>>>>> cephadm.services.osd] Processing DriveGroup  
>>>>> DriveGroupSpec.from_json(yaml.safe_load('''service_type: osd
>>>>> service_name: osd
>>>>> placement:
>>>>>   host_pattern: soc9-ceph
>>>>> spec:
>>>>>   crush_device_class: hdd-ec
>>>>>   data_devices:
>>>>>     paths:
>>>>>     - /dev/vdf
>>>>>   filter_logic: AND
>>>>>   objectstore: bluestore
>>>>> '''))
>>>>>
>>>>> So parsing the manual command also results in crush_device_class
>>>>> being nested under spec. Am I doing something wrong here?
>>>>>
>>>>> Thanks!
>>>>> Eugen
>>>>>
>>>>> [0]  
>>>>> https://docs.ceph.com/en/latest/cephadm/services/osd/#advanced-osd-service-specifications
>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


