On 3/22/21 3:52 PM, Nico Schottelius wrote:
Hello,
following up on my mail from 2020 [0]: it seems that OSDs sometimes have
"multiple classes" assigned:
[15:47:15] server6.place6:/var/lib/ceph/osd/ceph-4# ceph osd crush rm-device-class osd.4
done removing class of osd(s): 4
[15:47:17] server6.place6:/var/lib/ceph/osd/ceph-4# ceph osd crush rm-device-class osd.4
osd.4 belongs to no class,
[15:47:20] server6.place6:/var/lib/ceph/osd/ceph-4# ceph osd crush set-device-class xruk osd.4
set osd(s) 4 to class 'xruk'
[15:47:45] server6.place6:/var/lib/ceph/osd/ceph-4# ceph osd crush set-device-class xruk osd.4
osd.4 already set to class xruk. set-device-class item id 4 name 'osd.4' device_class 'xruk': no change.
[15:47:47] server6.place6:/var/lib/ceph/osd/ceph-4# /usr/bin/ceph-osd -i 4 --pid-file /var/run/ceph/osd.4.pid -c /etc/ceph/ceph.conf --cluster ceph --setuser ceph --setgroup ceph
2021-03-22 15:48:02.773 7fe2f81e4d80 -1 osd.4 94608 log_to_monitors {default=true}
2021-03-22 15:48:02.777 7fe2f81e4d80 -1 osd.4 94608 mon_cmd_maybe_osd_create fail: 'osd.4 has already bound to class 'xruk', can not reset class to 'hdd'; use 'ceph osd crush rm-device-class <id>' to remove old class first': (16) Device or resource busy
[15:48:02] server6.place6:/var/lib/ceph/osd/ceph-4#
We are running ceph 14.2.9.
As written before, it also seems that the affected OSD is peering with
OSDs from the wrong class (hdd). Does anyone have a hint on how to fix
this?
Do you have osd_class_update_on_start enabled?
On our cluster, NVMe OSDs would wrongly try to add themselves to the "SSD"
class (which didn't succeed). But maybe your OSDs sometimes do manage to
put themselves in the wrong class? Just guessing, but I would turn that
off. The same goes for this parameter:
osd_crush_update_on_start
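
A minimal sketch of what turning both options off could look like in ceph.conf, assuming they are placed in the [osd] section and the OSDs are restarted afterwards (the section placement is an assumption, adjust it to your setup):

```ini
[osd]
# Assumption: set under [osd] so it applies to all OSDs on this host.
# Do not let an OSD set its own device class when it starts.
osd_class_update_on_start = false
# Do not let an OSD update its own CRUSH location when it starts.
osd_crush_update_on_start = false
```

As far as I know, with the second option off, OSDs no longer place themselves in the CRUSH map at startup, so newly created OSDs would have to be added to CRUSH by hand.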
Gr. Stefan