Re: Orchestrator is internally ignoring applying a spec against SSDs, apparently determining they're rotational.

Hi!  So I nuked the cluster, zapped all the disks, and redeployed.

Then I applied this OSD spec (this time via the dashboard, since I was full
of hope):

service_type: osd
service_id: osd_spec_default
placement:
  host_pattern: '*'
data_devices:
  rotational: 1
db_devices:
  rotational: 0
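
(A sanity check that the orchestrator classifies the drives the way this spec
expects -- the listing shows each device as hdd or ssd, which should correspond
to the rotational flag the spec matches on; ceph05 as the example host:)

ceph orch device ls ceph05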

--dry-run showed exactly what I hoped to see.
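
(For reference, the CLI equivalent of that check -- assuming the spec above is
saved as osd_spec.yml, a filename I've made up here:)

ceph orch apply -i osd_spec.yml --dry-run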

Upon application, hosts 1-4 worked just fine.  Host 5... not so much.  I see
logical volumes being created, but no OSDs are coming online.  Moreover,
cephadm has taken days on host 5 to build just a few LVs.

I nuked all the LVs on that host, then zapped with sgdisk, then dd'd the
drives with /dev/urandom, then rebooted... the problem persists!
cephadm started creating VGs/LVs again, but no new OSDs appeared.
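
(For the record, the per-drive wipe was roughly the following -- the device and
VG names are illustrative, and the dd count is just however much patience
allowed, not a full-disk pass:)

lvremove -y ceph-<vg-uuid>    # remove all LVs in the ceph VG
vgremove -y ceph-<vg-uuid>    # then the VG itself
sgdisk --zap-all /dev/sdX     # zap primary and backup GPT structures
dd if=/dev/urandom of=/dev/sdX bs=1M count=1024 oflag=direct
reboot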

The wall of text below might hold a hint... but what it claims isn't true!
There's no partition on these drives!  They've been wiped with /dev/urandom!

Here's a dump of a relevant part of /var/log/ceph/cephadm.log.  Since
formatting is stripped, I've spaced out the interesting part.  It's a shame
this process is still so unreliable.

2021-10-05 20:43:41,499 INFO Non-zero exit code 1 from /usr/bin/docker run
--rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint
/usr/sbin/ceph-volume --privileged --group-add=disk --init -e
CONTAINER_IMAGE=
quay.io/ceph/ceph@sha256:5755c3a5c197ef186b8186212e023565f15b799f1ed411207f2c3fcd4a80ab45
-e NODE_NAME=ceph05 -e CEPH_USE_RANDOM_NONCE=1 -e
CEPH_VOLUME_OSDSPEC_AFFINITY=dashboard-admin-1633379370439 -v
/var/run/ceph/23e192fe-221d-11ec-a2cb-a16209e26d65:/var/run/ceph:z -v
/var/log/ceph/23e192fe-221d-11ec-a2cb-a16209e26d65:/var/log/ceph:z -v
/var/lib/ceph/23e192fe-221d-11ec-a2cb-a16209e26d65/crash:/var/lib/ceph/crash:z
-v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v
/run/lock/lvm:/run/lock/lvm -v /tmp/ceph-tmpu5c6jw0u:/etc/ceph/ceph.conf:z
-v /tmp/ceph-tmpk1wgba4u:/var/lib/ceph/bootstrap-osd/ceph.keyring:z
quay.io/ceph/ceph@sha256:5755c3a5c197ef186b8186212e023565f15b799f1ed411207f2c3fcd4a80ab45
lvm batch --no-auto /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf
/dev/sdg /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo
/dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx
--wal-devices /dev/sdp /dev/sdq --yes --no-systemd
2021-10-05 20:43:41,499 INFO /usr/bin/docker: stderr --> passed data
devices: 21 physical, 0 LVM
2021-10-05 20:43:41,499 INFO /usr/bin/docker: stderr --> relative data
size: 1.0
2021-10-05 20:43:41,499 INFO /usr/bin/docker: stderr --> passed block_wal
devices: 2 physical, 0 LVM
2021-10-05 20:43:41,500 INFO /usr/bin/docker: stderr Running command:
/usr/bin/ceph-authtool --gen-print-key
2021-10-05 20:43:41,500 INFO /usr/bin/docker: stderr Running command:
/usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring
/var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new
a97fda7a-586f-4ced-86e0-b0a18e081ec7
2021-10-05 20:43:41,500 INFO /usr/bin/docker: stderr Running command:
/usr/sbin/vgcreate --force --yes ceph-19158c90-90e6-4a37-98e2-7e0e45cd5e27
/dev/sdn
2021-10-05 20:43:41,500 INFO /usr/bin/docker: stderr  stdout: Physical
volume "/dev/sdn" successfully created.
2021-10-05 20:43:41,500 INFO /usr/bin/docker: stderr  stdout: Volume group
"ceph-19158c90-90e6-4a37-98e2-7e0e45cd5e27" successfully created
2021-10-05 20:43:41,500 INFO /usr/bin/docker: stderr Running command:
/usr/sbin/lvcreate --yes -l 238467 -n
osd-block-a97fda7a-586f-4ced-86e0-b0a18e081ec7
ceph-19158c90-90e6-4a37-98e2-7e0e45cd5e27
2021-10-05 20:43:41,500 INFO /usr/bin/docker: stderr  stdout: Logical
volume "osd-block-a97fda7a-586f-4ced-86e0-b0a18e081ec7" created.
2021-10-05 20:43:41,501 INFO /usr/bin/docker: stderr Running command:
/usr/sbin/vgcreate --force --yes ceph-84b7458f-4888-41a7-a6d6-031d85bfc9e4
/dev/sdp

2021-10-05 20:43:41,501 INFO /usr/bin/docker: stderr  *stderr: Cannot use
/dev/sdp: device is partitioned*

2021-10-05 20:43:41,501 INFO /usr/bin/docker: stderr   Command requires all
devices to be found.
2021-10-05 20:43:41,501 INFO /usr/bin/docker: stderr --> Was unable to
complete a new OSD, will rollback changes
2021-10-05 20:43:41,501 INFO /usr/bin/docker: stderr Running command:
/usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring
/var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.84
--yes-i-really-mean-it
2021-10-05 20:43:41,501 INFO /usr/bin/docker: stderr  stderr: purged osd.84
2021-10-05 20:43:41,501 INFO /usr/bin/docker: stderr -->  RuntimeError:
command returned non-zero exit status: 5
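
(For anyone hitting the same wall: these are the checks I plan to run next,
using /dev/sdp as the example device.  This is a sketch -- my working guess is
that blkid/udev still report a stale partition-table signature that LVM's
filter trips over:)

wipefs /dev/sdp                       # list any signatures blkid still sees
lsblk -o NAME,TYPE,FSTYPE,PTTYPE /dev/sdp
sgdisk --zap-all /dev/sdp             # also clears the backup GPT header at the
                                      # end of the disk, which dd over the start
                                      # of the drive never touches
wipefs -a /dev/sdp                    # erase whatever signatures remain
partprobe /dev/sdp; udevadm settle    # make the kernel/udev re-read the device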


