Adding a new server to an existing Ceph cluster - with separate block.db on NVMe

Hi,

I am trying to add a new server to an existing cluster, but cannot get the OSDs to create correctly.
When I run
cephadm ceph-volume lvm create, it returns nothing but the container info:

[root@hiho ~]# cephadm ceph-volume lvm create --bluestore --data /dev/sdd --block.db /dev/nvme0n1p3
Inferring fsid fe3a7cb0-69ca-11eb-8d45-c86000d08867
Using ceph image with id 'cc65afd6173a' and tag '<none>' created on 2022-10-17 23:41:41 +0000 UTC
quay.io/ceph/ceph@sha256:2b73ccc9816e0a1ee1dfbe21ba9a8cc085210f1220f597b5050ebfcac4bdd346

So I tried "cephadm shell" and ran the same command inside the container:

ceph-volume lvm create --bluestore --data /dev/sdd --block.db /dev/nvme0n1p3

This fails because the bootstrap-osd keyring is not present in the container:


Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 41dafd4d-0579-4119-acca-6db31586a10f
stderr: 2023-03-28T03:32:27.436+0000 7fa5d6253700 -1 auth: unable to find a keyring on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such file or directory
stderr: 2023-03-28T03:32:27.436+0000 7fa5d6253700 -1 AuthRegistry(0x7fa5d0060d70) no keyring found at /var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
stderr: 2023-03-28T03:32:27.436+0000 7fa5d6253700 -1 auth: unable to find a keyring on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such file or directory
stderr: 2023-03-28T03:32:27.436+0000 7fa5d6253700 -1 AuthRegistry(0x7fa5d0063da0) no keyring found at /var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
stderr: 2023-03-28T03:32:27.437+0000 7fa5d6253700 -1 auth: unable to find a keyring on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such file or directory
stderr: 2023-03-28T03:32:27.437+0000 7fa5d6253700 -1 AuthRegistry(0x7fa5d6251ea0) no keyring found at /var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
stderr: 2023-03-28T03:32:27.451+0000 7fa5ceffd700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [1]
stderr: 2023-03-28T03:32:27.453+0000 7fa5cf7fe700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [1]
stderr: 2023-03-28T03:32:27.473+0000 7fa5cffff700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [1]
stderr: 2023-03-28T03:32:27.474+0000 7fa5d6253700 -1 monclient: authenticate NOTE: no keyring found; disabled cephx authentication
stderr: [errno 13] RADOS permission denied (error connecting to the cluster)
-->  RuntimeError: Unable to create a new OSD id

I then copied the keyring file into the container using scp, but by that time the orchestrator had already created OSDs on the drives, so I had to delete the OSDs and start over.
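
To be concrete about the keyring step: I assume the same result could be had without scp by re-exporting the bootstrap-osd credentials from inside the shell, roughly like this (untested sketch, and it may well be the wrong approach):

# inside "cephadm shell" on the new host
mkdir -p /var/lib/ceph/bootstrap-osd
# export the bootstrap-osd key from the cluster into the path ceph-volume expects
ceph auth get client.bootstrap-osd -o /var/lib/ceph/bootstrap-osd/ceph.keyring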

Then, if I get the timing just right, I get this instead (from within the cephadm shell):

[ceph: root@hiho bootstrap-osd]# ceph-volume lvm create --bluestore --data /dev/sdd --block.db /dev/nvme0n1p3
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new e6e316d4-670d-4a9b-a50c-bc14d57394a3
Running command: nsenter --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net --uts=/rootfs/proc/1/ns/uts /sbin/vgcreate --force --yes ceph-4d95584a-df28-4e21-9480-09a13f1fb804 /dev/sdd
stdout: Physical volume "/dev/sdd" successfully created.
stdout: Volume group "ceph-4d95584a-df28-4e21-9480-09a13f1fb804" successfully created
Running command: nsenter --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net --uts=/rootfs/proc/1/ns/uts /sbin/lvcreate --yes -l 953861 -n osd-block-e6e316d4-670d-4a9b-a50c-bc14d57394a3 ceph-4d95584a-df28-4e21-9480-09a13f1fb804
stdout: Logical volume "osd-block-e6e316d4-670d-4a9b-a50c-bc14d57394a3" created.
Running command: nsenter --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net --uts=/rootfs/proc/1/ns/uts /sbin/lvcreate --yes -l 119209 -n osd-db-9fc4f199-2c95-4ca7-a35c-ef4b08c86804 ceph-948a633c-420e-4f55-8515-b33e1c0ef18c
stderr: Volume group "ceph-948a633c-420e-4f55-8515-b33e1c0ef18c" has insufficient free space (0 extents): 119209 required.
--> Was unable to complete a new OSD, will rollback changes
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.12 --yes-i-really-mean-it
stderr: purged osd.12
-->  RuntimeError: Unable to find any LV for zapping OSD: 12
[ceph: root@hiho bootstrap-osd]# ceph-volume lvm create --bluestore --data /dev/sdd --block.db /dev/nvme0n1p3
-->  RuntimeError: Device /dev/sdd has a filesystem.
[ceph: root@hiho bootstrap-osd]# ceph-volume lvm create --bluestore --data /dev/sdd --block.db /dev/nvme0n1p3
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new b93e1a8a-af88-431c-b705-f49d717b050f
Running command: nsenter --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net --uts=/rootfs/proc/1/ns/uts /sbin/vgcreate --force --yes ceph-e91c86b4-613f-45b0-b9d1-3bb76ed10f83 /dev/sdd
stdout: Physical volume "/dev/sdd" successfully created.
stdout: Volume group "ceph-e91c86b4-613f-45b0-b9d1-3bb76ed10f83" successfully created
Running command: nsenter --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net --uts=/rootfs/proc/1/ns/uts /sbin/lvcreate --yes -l 953861 -n osd-block-b93e1a8a-af88-431c-b705-f49d717b050f ceph-e91c86b4-613f-45b0-b9d1-3bb76ed10f83
stdout: Logical volume "osd-block-b93e1a8a-af88-431c-b705-f49d717b050f" created.
Running command: nsenter --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net --uts=/rootfs/proc/1/ns/uts /sbin/vgcreate --force --yes ceph-4abdb6f8-c891-43f8-8135-dd8470f80130 /dev/nvme0n1p3
stdout: Physical volume "/dev/nvme0n1p3" successfully created.
stdout: Volume group "ceph-4abdb6f8-c891-43f8-8135-dd8470f80130" successfully created
Running command: nsenter --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net --uts=/rootfs/proc/1/ns/uts /sbin/lvcreate --yes -l 119209 -n osd-db-5bab4d99-e2e5-48e7-aee6-d1a2103b9d13 ceph-4abdb6f8-c891-43f8-8135-dd8470f80130
stdout: Wiping ceph_bluestore signature on /dev/ceph-4abdb6f8-c891-43f8-8135-dd8470f80130/osd-db-5bab4d99-e2e5-48e7-aee6-d1a2103b9d13.
stdout: Logical volume "osd-db-5bab4d99-e2e5-48e7-aee6-d1a2103b9d13" created.
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-12
Running command: /usr/bin/chown -h ceph:ceph /dev/ceph-e91c86b4-613f-45b0-b9d1-3bb76ed10f83/osd-block-b93e1a8a-af88-431c-b705-f49d717b050f
Running command: /usr/bin/chown -R ceph:ceph /dev/dm-5
Running command: /usr/bin/ln -s /dev/ceph-e91c86b4-613f-45b0-b9d1-3bb76ed10f83/osd-block-b93e1a8a-af88-431c-b705-f49d717b050f /var/lib/ceph/osd/ceph-12/block
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-12/activate.monmap
stderr: got monmap epoch 33
--> Creating keyring file for osd.12
Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-12/keyring
Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-12/
Running command: /usr/bin/chown -h ceph:ceph /dev/ceph-4abdb6f8-c891-43f8-8135-dd8470f80130/osd-db-5bab4d99-e2e5-48e7-aee6-d1a2103b9d13
Running command: /usr/bin/chown -R ceph:ceph /dev/dm-8
Running command: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 12 --monmap /var/lib/ceph/osd/ceph-12/activate.monmap --keyfile - --bluestore-block-db-path /dev/ceph-4abdb6f8-c891-43f8-8135-dd8470f80130/osd-db-5bab4d99-e2e5-48e7-aee6-d1a2103b9d13 --osd-data /var/lib/ceph/osd/ceph-12/ --osd-uuid b93e1a8a-af88-431c-b705-f49d717b050f --setuser ceph --setgroup ceph
stderr: 2023-03-28T03:23:21.180+0000 7f0a40fd63c0 -1 bluestore(/var/lib/ceph/osd/ceph-12/) _read_fsid unparsable uuid
--> ceph-volume lvm prepare successful for: /dev/sdd
Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-12
Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-e91c86b4-613f-45b0-b9d1-3bb76ed10f83/osd-block-b93e1a8a-af88-431c-b705-f49d717b050f --path /var/lib/ceph/osd/ceph-12 --no-mon-config
Running command: /usr/bin/ln -snf /dev/ceph-e91c86b4-613f-45b0-b9d1-3bb76ed10f83/osd-block-b93e1a8a-af88-431c-b705-f49d717b050f /var/lib/ceph/osd/ceph-12/block
Running command: /usr/bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-12/block
Running command: /usr/bin/chown -R ceph:ceph /dev/dm-5
Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-12
Running command: /usr/bin/ln -snf /dev/ceph-4abdb6f8-c891-43f8-8135-dd8470f80130/osd-db-5bab4d99-e2e5-48e7-aee6-d1a2103b9d13 /var/lib/ceph/osd/ceph-12/block.db
Running command: /usr/bin/chown -h ceph:ceph /dev/ceph-4abdb6f8-c891-43f8-8135-dd8470f80130/osd-db-5bab4d99-e2e5-48e7-aee6-d1a2103b9d13
Running command: /usr/bin/chown -R ceph:ceph /dev/dm-8
Running command: /usr/bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-12/block.db
Running command: /usr/bin/chown -R ceph:ceph /dev/dm-8
Running command: /usr/bin/systemctl enable ceph-volume@lvm-12-b93e1a8a-af88-431c-b705-f49d717b050f
stderr: Created symlink /etc/systemd/system/multi-user.target.wants/ceph-volume@lvm-12-b93e1a8a-af88-431c-b705-f49d717b050f.service -> /usr/lib/systemd/system/ceph-volume@.service.
Running command: /usr/bin/systemctl enable --runtime ceph-osd@12
stderr: Created symlink /run/systemd/system/ceph-osd.target.wants/ceph-osd@12.service -> /usr/lib/systemd/system/ceph-osd@.service.
Running command: /usr/bin/systemctl start ceph-osd@12
stderr: Failed to connect to bus: No such file or directory
--> Was unable to complete a new OSD, will rollback changes
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.12 --yes-i-really-mean-it
stderr: purged osd.12
--> Zapping: /dev/ceph-e91c86b4-613f-45b0-b9d1-3bb76ed10f83/osd-block-b93e1a8a-af88-431c-b705-f49d717b050f
--> Unmounting /var/lib/ceph/osd/ceph-12
Running command: /usr/bin/umount -v /var/lib/ceph/osd/ceph-12
stderr: umount: /var/lib/ceph/osd/ceph-12 unmounted
Running command: /usr/bin/dd if=/dev/zero of=/dev/ceph-e91c86b4-613f-45b0-b9d1-3bb76ed10f83/osd-block-b93e1a8a-af88-431c-b705-f49d717b050f bs=1M count=10 conv=fsync
stderr: 10+0 records in
10+0 records out
stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.0771422 s, 136 MB/s
--> Only 1 LV left in VG, will proceed to destroy volume group ceph-e91c86b4-613f-45b0-b9d1-3bb76ed10f83
Running command: nsenter --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net --uts=/rootfs/proc/1/ns/uts /sbin/vgremove -v -f ceph-e91c86b4-613f-45b0-b9d1-3bb76ed10f83
stderr: Removing ceph--e91c86b4--613f--45b0--b9d1--3bb76ed10f83-osd--block--b93e1a8a--af88--431c--b705--f49d717b050f (253:5)
stderr: Releasing logical volume "osd-block-b93e1a8a-af88-431c-b705-f49d717b050f"
  Archiving volume group "ceph-e91c86b4-613f-45b0-b9d1-3bb76ed10f83" metadata (seqno 5).
stdout: Logical volume "osd-block-b93e1a8a-af88-431c-b705-f49d717b050f" successfully removed.
stderr: Removing physical volume "/dev/sdd" from volume group "ceph-e91c86b4-613f-45b0-b9d1-3bb76ed10f83"
stdout: Volume group "ceph-e91c86b4-613f-45b0-b9d1-3bb76ed10f83" successfully removed
stderr: Creating volume group backup "/etc/lvm/backup/ceph-e91c86b4-613f-45b0-b9d1-3bb76ed10f83" (seqno 6).
Running command: nsenter --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net --uts=/rootfs/proc/1/ns/uts /sbin/pvremove -v -f -f /dev/sdd
stdout: Labels on physical volume "/dev/sdd" successfully wiped.
--> Zapping: /dev/ceph-4abdb6f8-c891-43f8-8135-dd8470f80130/osd-db-5bab4d99-e2e5-48e7-aee6-d1a2103b9d13
Running command: /usr/bin/dd if=/dev/zero of=/dev/ceph-4abdb6f8-c891-43f8-8135-dd8470f80130/osd-db-5bab4d99-e2e5-48e7-aee6-d1a2103b9d13 bs=1M count=10 conv=fsync
stderr: 10+0 records in
10+0 records out
stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.0131332 s, 798 MB/s
--> Only 1 LV left in VG, will proceed to destroy volume group ceph-4abdb6f8-c891-43f8-8135-dd8470f80130
Running command: nsenter --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net --uts=/rootfs/proc/1/ns/uts /sbin/vgremove -v -f ceph-4abdb6f8-c891-43f8-8135-dd8470f80130
stderr: Removing ceph--4abdb6f8--c891--43f8--8135--dd8470f80130-osd--db--5bab4d99--e2e5--48e7--aee6--d1a2103b9d13 (253:8)
stderr: Releasing logical volume "osd-db-5bab4d99-e2e5-48e7-aee6-d1a2103b9d13"
  Archiving volume group "ceph-4abdb6f8-c891-43f8-8135-dd8470f80130" metadata (seqno 5).
stdout: Logical volume "osd-db-5bab4d99-e2e5-48e7-aee6-d1a2103b9d13" successfully removed.
stderr: Removing physical volume "/dev/nvme0n1p3" from volume group "ceph-4abdb6f8-c891-43f8-8135-dd8470f80130"
stdout: Volume group "ceph-4abdb6f8-c891-43f8-8135-dd8470f80130" successfully removed
stderr: Creating volume group backup "/etc/lvm/backup/ceph-4abdb6f8-c891-43f8-8135-dd8470f80130" (seqno 6).
Running command: nsenter --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net --uts=/rootfs/proc/1/ns/uts /sbin/pvremove -v -f -f /dev/nvme0n1p3
stdout: Labels on physical volume "/dev/nvme0n1p3" successfully wiped.
--> Zapping successful for OSD: 12
-->  RuntimeError: command returned non-zero exit status: 1


I have 3 other servers with the same data/block.db split that I was able to set up just using cephadm, so I am not sure what is different about this one.


Is there any way to either add the block.db to the already-created OSDs, get around the missing bootstrap-osd keyring, or manually configure the data disk and block.db?
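
By "add the block.db" I am thinking of something like ceph-volume's new-db subcommand, though I do not know whether it applies to OSDs created by the orchestrator; the target VG/LV below is made up and I would have to carve it out of nvme0n1p3 first. Rough sketch:

# run inside cephadm shell, with the OSD stopped; placeholders in angle brackets,
# and "ceph-db-vg/ceph-db-0" is a hypothetical LV for the new DB
ceph-volume lvm new-db --osd-id <osd-id> --osd-fsid <osd-fsid> --target ceph-db-vg/ceph-db-0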

I have also tried
ceph orch apply osd --all-available-devices --unmanaged=true
to stop Ceph from automatically claiming the devices, but it still does.
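
Alternatively, rather than fighting the automatic placement, would a service spec along these lines let the orchestrator do the split itself? Untested sketch: the service_id and the rotational filter are guesses, and I am not sure a partition such as /dev/nvme0n1p3 is accepted as a db device.

ceph orch apply -i - <<'EOF'
service_type: osd
service_id: hiho_hdd_nvme_db
placement:
  hosts:
    - hiho
spec:
  data_devices:
    rotational: 1
  db_devices:
    paths:
      - /dev/nvme0n1p3
EOF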

Thanks,
Rob


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


