cephadm does not redeploy OSD

Hi,

We are running a Ceph cluster managed with cephadm v16.2.13. Recently we needed to replace a disk, and we removed its OSD with:

ceph orch osd rm 37 --replace

It worked fine: the disk was drained and the OSD was marked as destroyed.
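For reference, I believe the removal/replacement state can be double-checked with something like:

ceph orch osd rm status          # empty once the drain has finished
ceph osd tree | grep destroyed   # osd.37 should be listed as destroyed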

However, after changing the disk, no new OSD was created. Looking at the DB device, the DB partition for OSD 37 was still there, so we destroyed it with:

ceph-volume lvm zap --osd-id=37 --destroy
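To see what ceph-volume still knows about on the host, I assume something like this (run on node05) is the right check:

cephadm shell -- ceph-volume lvm list   # the osd.37 DB LV should be gone after the zap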

But still no OSD was redeployed.

Here is our spec:

---
service_type: osd
service_id: osd-hdd
placement:
  label: osds
spec:
  data_devices:
    rotational: 1
  encrypted: true
  db_devices:
    size: '1TB:2TB'
  db_slots: 12
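In case the orchestrator has a different version of the spec stored, it can be dumped for comparison with something like:

ceph orch ls osd --export   # stored OSD service specs, to compare with osd_specs.yml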

And the disk looks good:

HOST    PATH          TYPE  DEVICE ID                                   SIZE   AVAILABLE  REFRESHED  REJECT REASONS
node05  /dev/nvme2n1  ssd   SAMSUNG MZPLJ1T6HBJR-00007_S55JNG0R600357   1600G             12m ago    LVM detected, locked
node05  /dev/sdk      hdd   SEAGATE_ST10000NM0206_ZA21G2170000C7240KPF  10.0T  Yes        12m ago
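If a more detailed device report helps, including the full reject reasons, I assume it can be pulled with something like:

ceph orch device ls node05 --wide --refresh   # force a fresh inventory and show all columns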

And the VG on the db_device looks to have enough space:

VG                                        #PV #LV #SN Attr   VSize  VFree
ceph-33b06f1a-f6f6-57cf-9ca8-6e4aa81caae0   1  11   0 wz--n- <1.46t 173.91g

(With db_slots: 12 on a <1.46t VG, each DB LV should come out to roughly 1.46 TiB / 12 ≈ 124 GiB, so 173.91g free should be enough for one more slot, if I understand the slot sizing correctly.)
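That is vgs output from node05; the DB LVs already carved out of that VG can be listed too, e.g.:

vgs ceph-33b06f1a-f6f6-57cf-9ca8-6e4aa81caae0                      # free space on the DB VG
lvs -o lv_name,lv_size ceph-33b06f1a-f6f6-57cf-9ca8-6e4aa81caae0   # existing DB LVs and their sizes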

If I remove db_devices and db_slots from the spec and do a dry run, the orchestrator seems to see the new disk as available:

ceph orch apply -i osd_specs.yml --dry-run
WARNING! Dry-Runs are snapshots of a certain point in time and are bound
to the current inventory setup. If any of these conditions change, the
preview will be invalid. Please make sure to have a minimal
timeframe between planning and applying the specs.
####################
SERVICESPEC PREVIEWS
####################
+---------+------+--------+-------------+
|SERVICE |NAME |ADD_TO |REMOVE_FROM |
+---------+------+--------+-------------+
+---------+------+--------+-------------+
################
OSDSPEC PREVIEWS
################
+---------+---------+-------------------------+----------+----+-----+
|SERVICE |NAME |HOST |DATA |DB |WAL |
+---------+---------+-------------------------+----------+----+-----+
|osd |osd-hdd |node05 |/dev/sdk |- |- |
+---------+---------+-------------------------+----------+----+-----+

But as soon as I add db_devices back, the orchestrator is happy as it is, as if there were nothing to do (applying the spec without db_devices is not really an option anyway, since that would presumably create the OSD without a separate DB on the NVMe):

ceph orch apply -i osd_specs.yml --dry-run
WARNING! Dry-Runs are snapshots of a certain point in time and are bound
to the current inventory setup. If any of these conditions change, the
preview will be invalid. Please make sure to have a minimal
timeframe between planning and applying the specs.
####################
SERVICESPEC PREVIEWS
####################
+---------+------+--------+-------------+
|SERVICE |NAME |ADD_TO |REMOVE_FROM |
+---------+------+--------+-------------+
+---------+------+--------+-------------+
################
OSDSPEC PREVIEWS
################
+---------+------+------+------+----+-----+
|SERVICE |NAME |HOST |DATA |DB |WAL |
+---------+------+------+------+----+-----+
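If it is of any use, the service status and the events the orchestrator logged for it can be checked, I think, with something like:

ceph orch ls osd --format yaml   # detailed view of the OSD services, including recent events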

I do not know why Ceph will not use this disk, and I do not know where to look; the logs do not seem to say anything. And the weirdest thing: another disk was replaced on the same machine, and that went through without any issues.
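For what it is worth, the only extra logging I can think of is turning up the cephadm debug log in the mgr while re-applying the spec, along the lines of:

ceph config set mgr mgr/cephadm/log_to_cluster_level debug   # raise cephadm logging
ceph -W cephadm --watch-debug                                # watch it while applying the spec
ceph config set mgr mgr/cephadm/log_to_cluster_level info    # turn it back down afterwards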

Luis Domingues
Proton AG
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


