You should either zap the devices with
ceph orch device zap my_hostname my_path --force
or use ceph-volume directly on that host:
cephadm ceph-volume lvm zap --destroy /dev/sdX
IIRC there's a backup of the partition table at the end of the disk.
I would expect ceph-volume to identify those drives as unavailable,
but apparently they still show up as available?
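If a leftover backup GPT at the end of the drive is what trips it up,
something like this should clear both copies (the device name is just an
example, so double-check it first):

sgdisk --zap-all /dev/sdp   # removes the primary and backup GPT plus the protective MBR
wipefs -a /dev/sdp          # clears any remaining filesystem/LVM signatures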
If 4 of 5 nodes have successfully created OSDs, could you set the osd
specs to "unmanaged: true" and then zap all OSD devices on that
failing host again with 'ceph orch device zap'?
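A rough sketch of what I mean, assuming the spec below is the one you
applied (the filename is just a placeholder):

service_type: osd
service_id: osd_spec_default
placement:
  host_pattern: '*'
unmanaged: true
data_devices:
  rotational: 1
db_devices:
  rotational: 0

ceph orch apply -i osd_spec.yml

With unmanaged set, cephadm won't immediately try to recreate OSDs on the
freshly zapped devices until you flip it back to false.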
If the zap finishes successfully, could you then run this command on the
failing OSD host and paste the output here:
cephadm ceph-volume inventory
and maybe also this:
lsblk -o name,rota,size
Quoting Chris <hagfelsh@xxxxxxxxx>:
Hi! So I nuked the cluster, zapped all the disks, and redeployed.
Then I applied this osd spec (this time via the dashboard since I was full
of hope):
service_type: osd
service_id: osd_spec_default
placement:
  host_pattern: '*'
data_devices:
  rotational: 1
db_devices:
  rotational: 0
--dry-run showed exactly what I hoped to see.
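(For reference, the CLI equivalent would be something like
"ceph orch apply -i osd_spec.yml --dry-run", with the spec above saved to a
file; the filename is just an example.)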
Upon application, hosts 1-4 worked just fine. Host 5... not so much. I see
logical volumes being created, but no OSDs are coming online. Moreover,
cephadm has spent days on host 5 building just a few LVs.
I nuked all the LVs on that host, then zapped with sgdisk, then dd'd the
drives with /dev/urandom, then rebooted... and the problem persists!
cephadm starts creating VGs/LVs, but no new OSDs come up.
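For the record, the cleanup per drive looked roughly like this (the VG name,
device name and dd extent are placeholders):

vgremove -f ceph-<vg-uuid>        # also removes the LVs inside the VG
sgdisk --zap-all /dev/sdX         # wipes the GPT/MBR structures
dd if=/dev/urandom of=/dev/sdX bs=1M count=1024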
This wall of text might have a hint... but it's not true! There's no
partition on these! They've been wiped with /dev/urandom!
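If it helps, checks like these should show whether any partition table or
signature survived (sdp is just an example):

sgdisk --print /dev/sdp   # should not find a valid partition table
wipefs /dev/sdp           # without -a this only lists detected signatures
lsblk -f /dev/sdp         # no filesystem or LVM metadata expected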
Here's a dump of a relevant part of /var/log/ceph/cephadm.log. Since
formatting is stripped, I've spaced out the interesting part. It's a shame
this process is still so unreliable.
2021-10-05 20:43:41,499 INFO Non-zero exit code 1 from /usr/bin/docker run
--rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint
/usr/sbin/ceph-volume --privileged --group-add=disk --init -e
CONTAINER_IMAGE=
quay.io/ceph/ceph@sha256:5755c3a5c197ef186b8186212e023565f15b799f1ed411207f2c3fcd4a80ab45
-e NODE_NAME=ceph05 -e CEPH_USE_RANDOM_NONCE=1 -e
CEPH_VOLUME_OSDSPEC_AFFINITY=dashboard-admin-1633379370439 -v
/var/run/ceph/23e192fe-221d-11ec-a2cb-a16209e26d65:/var/run/ceph:z -v
/var/log/ceph/23e192fe-221d-11ec-a2cb-a16209e26d65:/var/log/ceph:z -v
/var/lib/ceph/23e192fe-221d-11ec-a2cb-a16209e26d65/crash:/var/lib/ceph/crash:z
-v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v
/run/lock/lvm:/run/lock/lvm -v /tmp/ceph-tmpu5c6jw0u:/etc/ceph/ceph.conf:z
-v /tmp/ceph-tmpk1wgba4u:/var/lib/ceph/bootstrap-osd/ceph.keyring:z
quay.io/ceph/ceph@sha256:5755c3a5c197ef186b8186212e023565f15b799f1ed411207f2c3fcd4a80ab45
lvm batch --no-auto /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf
/dev/sdg /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo
/dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
--wal-devices /dev/sdp /dev/sdq --yes --no-systemd
2021-10-05 20:43:41,499 INFO /usr/bin/docker: stderr --> passed data
devices: 21 physical, 0 LVM
2021-10-05 20:43:41,499 INFO /usr/bin/docker: stderr --> relative data
size: 1.0
2021-10-05 20:43:41,499 INFO /usr/bin/docker: stderr --> passed block_wal
devices: 2 physical, 0 LVM
2021-10-05 20:43:41,500 INFO /usr/bin/docker: stderr Running command:
/usr/bin/ceph-authtool --gen-print-key
2021-10-05 20:43:41,500 INFO /usr/bin/docker: stderr Running command:
/usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring
/var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new
a97fda7a-586f-4ced-86e0-b0a18e081ec7
2021-10-05 20:43:41,500 INFO /usr/bin/docker: stderr Running command:
/usr/sbin/vgcreate --force --yes ceph-19158c90-90e6-4a37-98e2-7e0e45cd5e27
/dev/sdn
2021-10-05 20:43:41,500 INFO /usr/bin/docker: stderr stdout: Physical
volume "/dev/sdn" successfully created.
2021-10-05 20:43:41,500 INFO /usr/bin/docker: stderr stdout: Volume group
"ceph-19158c90-90e6-4a37-98e2-7e0e45cd5e27" successfully created
2021-10-05 20:43:41,500 INFO /usr/bin/docker: stderr Running command:
/usr/sbin/lvcreate --yes -l 238467 -n
osd-block-a97fda7a-586f-4ced-86e0-b0a18e081ec7
ceph-19158c90-90e6-4a37-98e2-7e0e45cd5e27
2021-10-05 20:43:41,500 INFO /usr/bin/docker: stderr stdout: Logical
volume "osd-block-a97fda7a-586f-4ced-86e0-b0a18e081ec7" created.
2021-10-05 20:43:41,501 INFO /usr/bin/docker: stderr Running command:
/usr/sbin/vgcreate --force --yes ceph-84b7458f-4888-41a7-a6d6-031d85bfc9e4
/dev/sdp
2021-10-05 20:43:41,501 INFO /usr/bin/docker: stderr *stderr: Cannot use
/dev/sdp: device is partitioned*
2021-10-05 20:43:41,501 INFO /usr/bin/docker: stderr Command requires all
devices to be found.
2021-10-05 20:43:41,501 INFO /usr/bin/docker: stderr --> Was unable to
complete a new OSD, will rollback changes
2021-10-05 20:43:41,501 INFO /usr/bin/docker: stderr Running command:
/usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring
/var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.84
--yes-i-really-mean-it
2021-10-05 20:43:41,501 INFO /usr/bin/docker: stderr stderr: purged osd.84
2021-10-05 20:43:41,501 INFO /usr/bin/docker: stderr --> RuntimeError:
command returned non-zero exit status: 5
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx