Alright, now `ceph orch device ls` shows the disk as locked:

HOST       PATH       TYPE  DEVICE ID                                  SIZE   AVAILABLE  REFRESHED  REJECT REASONS
node-osd1  /dev/sdah  hdd   LENOVO_ST18000NM004J_ZR5F5TH00000W413B814  18.0T  No         10m ago    locked

I also noticed that in `ceph osd crush dump`, where osd.2 used to appear in the device list, there is now a mysterious "device2":

{
    "devices": [
        {
            "id": 0,
            "name": "osd.0",
            "class": "hdd"
        },
        {
            "id": 1,
            "name": "osd.1",
            "class": "hdd"
        },
        {
            "id": 2,
            "name": "device2"
        },
        {
            "id": 3,
            "name": "osd.3",
            "class": "hdd"
        },
        ...

Even though lsblk shows a Ceph LVM volume created on the disk, it seems that the creation was not complete:

# lsblk /dev/sdah
NAME                                                                                                  MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sdah                                                                                                   66:16   0 16.4T  0 disk
└─ceph--157c7c44--b519--4f6d--a54b--ecd466cf81d0-osd--block--10ada574--b99c--466d--8ad6--2c97d17d1f66 253:59   0 16.4T  0 lvm
# blkid /dev/sdah
# lvs | grep 157c7c44
# vgs | grep 157c7c44
# pvs | grep sdah
#

It seems that OSD creation was in progress and somehow only got halfway there. Could it have to do with the fact that our cluster is not in the best shape, though it is recovering well? `ceph -s` below:

  cluster:
    id:     26315dca-383a-11ee-9d49-00620b4c2392
    health: HEALTH_ERR
            987 scrub errors
            Possible data damage: 17 pgs inconsistent
            2192 pgs not deep-scrubbed in time
            2192 pgs not scrubbed in time

  services:
    mon: 5 daemons, quorum node-admin1,node-admin2,node-osd1,node-osd2,node-osd3 (age 17M)
    mgr: node-admin2.sipadf(active, since 17M), standbys: node-admin1.nwaovh
    mds: 2/2 daemons up, 2 standby
    osd: 167 osds: 167 up (since 3d), 167 in (since 2h); 237 remapped pgs

  data:
    volumes: 2/2 healthy
    pools:   9 pools, 2273 pgs
    objects: 475.75M objects, 1.1 PiB
    usage:   1.6 PiB used, 1.1 PiB / 2.7 PiB avail
    pgs:     73449562/2835243480 objects misplaced (2.591%)
             2024 active+clean
             228  active+remapped+backfilling
             12   active+clean+inconsistent
             5    active+remapped+inconsistent+backfilling
             4    active+remapped+backfill_wait

  io:
    client:   56 MiB/s wr, 0 op/s rd, 97 op/s wr
    recovery: 1.0 GiB/s, 440 objects/s

  progress:
    Global Recovery Event (7w)
      [=========================...] (remaining: 6d)

________________________________
From: Gustavo Garcia Rondina <grondina@xxxxxxxxxxxx>
Sent: Thursday, March 6, 2025 4:04 PM
To: ceph-users@xxxxxxx <ceph-users@xxxxxxx>
Subject: Created no osd(s) on host, already created?

Hello list,

We have a Ceph cluster with two management nodes and six data nodes. Each data node has 28 HDD disks. One disk recently failed in one of the nodes, corresponding to osd.2.

To replace the disk, we took osd.2 out, stopped it, and after a few days removed it, basically:

ceph osd out osd.2
ceph osd ok-to-stop osd.2

Once OK to stop, then:

ceph orch daemon stop osd.2

Once stopped:

ceph osd crush remove osd.2
ceph auth del osd.2
ceph osd rm osd.2

Then checked `ceph osd tree`, `ceph orch ps`, and `ceph -s`, and all confirmed that the OSD was gone.

We then proceeded to physically replace the failed drive on the node. It was /dev/sdac before, and after replacing it (hot swap) the system identified it as /dev/sdah. I zapped it on the node with `sgdisk --zap-all /dev/sdah` and, back on the management node, I could see that the disk was now showing up and marked as available in `ceph orch device ls`.

Then I proceeded to add a new OSD on that disk with:

ceph orch daemon add osd node-osd1:/dev/sdah

which then failed with:

Created no osd(s) on host server-osd1; already created?

Which was rather strange, because the total OSD count was still showing as one less, and no new OSD was showing up in `ceph osd tree`.
What was even odder is that the disk that was showing as available in `ceph orch device ls` now shows as *not* available, and looking at lsblk's output on the node it seems that it was populated by Ceph:

# lsblk /dev/sdah
NAME                                                                                                  MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sdah                                                                                                   66:16   0 16.4T  0 disk
└─ceph--157c7c44--b519--4f6d--a54b--ecd466cf81d0-osd--block--10ada574--b99c--466d--8ad6--2c97d17d1f66 253:59   0 16.4T  0 lvm

Any hints on how to proceed to get the OSD added back with this disk?

Thank you for any suggestions!

- Gustavo
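
A minimal cleanup sketch for the state shown above, assuming the half-created OSD left behind only the LVM/device-mapper remnants visible in lsblk (the host, device path, and LV name are copied from this thread; the zap-and-retry approach is a suggestion rather than a confirmed fix, and it wipes /dev/sdah):

# From the management node, have cephadm wipe the leftover ceph-volume
# state on the disk (destructive to /dev/sdah):
ceph orch device zap node-osd1 /dev/sdah --force

# Or run ceph-volume on node-osd1 itself, via the cephadm container:
cephadm shell -- ceph-volume lvm zap --destroy /dev/sdah

# If lsblk still shows the stale ceph-* LV afterwards, remove the leftover
# device-mapper mapping using the exact name that lsblk printed:
dmsetup remove ceph--157c7c44--b519--4f6d--a54b--ecd466cf81d0-osd--block--10ada574--b99c--466d--8ad6--2c97d17d1f66

# Re-scan the devices and retry the add once the disk shows as available again:
ceph orch device ls --refresh
ceph orch daemon add osd node-osd1:/dev/sdah

As for the "device2" entry in the CRUSH dump, that placeholder is, as far as I know, just how the CRUSH map fills the hole left by a removed device id, and it should be replaced once a new OSD takes id 2 again.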