Re: Missing keyrings on upgraded cluster

Thanks, Adam.

Providing the keyring to the cephadm command worked, but the unwanted (though expected) side effect is that, from cephadm's perspective, the result is a stray daemon. For some reason the orchestrator did apply the desired drivegroup when I tried to reproduce this morning, but it failed again just now when I wanted to get rid of the stray daemon. This is one of the most annoying things about cephadm: I still don't fully understand when it will correctly apply an identical drivegroup.yml and when it won't. Anyway, the conclusion is not to interfere with cephadm (nothing new here), but since the drivegroup was not applied correctly I assumed I had to "help out" a bit by manually deploying an OSD.
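
For the record, this is roughly what I'm doing to clean up (as I
understand the commands; the spec file name is just mine):

ceph health detail                     # the manual OSD shows up as CEPHADM_STRAY_DAEMON
ceph orch apply osd -i drivegroup.yml  # re-apply the (identical) drivegroup spec
ceph cephadm osd activate nautilus     # ask cephadm to adopt OSDs deployed outside the orchestrator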

Thanks,
Eugen

Quoting Adam King <adking@xxxxxxxxxx>:

Going off of

ceph --cluster ceph --name client.bootstrap-osd --keyring
/var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json

you could try passing "--keyring <bootstrap-osd-keyring>" to the cephadm
ceph-volume command, something like 'cephadm ceph-volume --keyring
<bootstrap-osd-keyring> -- lvm create'. I'm guessing it's trying to run the
osd tree command within a container, and I know cephadm mounts keyrings
passed to the ceph-volume command as
"/var/lib/ceph/bootstrap-osd/ceph.keyring" inside the container.

On Mon, Feb 20, 2023 at 6:35 AM Eugen Block <eblock@xxxxxx> wrote:

Hi *,

I was playing around on an upgraded test cluster (from Nautilus to
Quincy), current version:

     "overall": {
         "ceph version 17.2.5
(98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)": 18
     }

I tried to replace an OSD after destroying it with 'ceph orch osd rm
osd.5 --replace'. The OSD was drained successfully and marked as
"destroyed" as expected, and the zapping also worked. At this point I
didn't have an OSD spec in place because all OSDs were adopted during
the upgrade process. So I created a new spec (sketched below), which
was not applied successfully (I'm wondering if there's another/new
issue with ceph-volume, but that's not the focus here), so I tried it
manually with 'cephadm ceph-volume lvm create'. I'll add the output at
the end for better readability. Apparently, there's no bootstrap-osd
keyring for cephadm, so it can't look up the desired osd_id in the osd
tree; the command it tries is this:

ceph --cluster ceph --name client.bootstrap-osd --keyring
/var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
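
For reference, the spec I tried to apply was a simple one, roughly of
this shape (illustrative, not the exact file), applied with
'ceph orch apply -i drivegroup.yml':

service_type: osd
service_id: default
placement:
  host_pattern: '*'
spec:
  data_devices:
    rotational: 1
  db_devices:
    rotational: 0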

On the host's filesystem, though, the required keyring is present:

nautilus:~ # cat /var/lib/ceph/bootstrap-osd/ceph.keyring
[client.bootstrap-osd]
         key = AQBOCbpgixIsOBAAgBzShsFg/l1bOze4eTZHug==
         caps mgr = "allow r"
         caps mon = "profile bootstrap-osd"
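
For comparison, the cluster-side key can be dumped with:

ceph auth get client.bootstrap-osd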

Is there something missing during the adoption process? Or are the
docs lacking some upgrade info? I found a section about putting
keyrings under management [1], but I'm not sure if that's what's
missing here.
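
If that mechanism is the intended fix, I suppose something like this
would put the bootstrap-osd keyring under cephadm's management
(untested on my side, the placement label is just an example):

ceph orch client-keyring set client.bootstrap-osd label:osd
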
Any insights are highly appreciated!

Thanks,
Eugen

[1]

https://docs.ceph.com/en/quincy/cephadm/operations/#putting-a-keyring-under-management


---snip---
nautilus:~ # cephadm ceph-volume lvm create --osd-id 5 --data /dev/sde
--block.db /dev/sdb --block.db-size 5G
Inferring fsid <FSID>
Using recent ceph image
<LOCAL_REGISTRY>/ceph/ceph@sha256
:af50ec26db7ee177e1ec1b553a0d6a9dbad2c3cc0da2f8f46d012184a79d4f92
Non-zero exit code 1 from /usr/bin/podman run --rm --ipc=host
--stop-signal=SIGTERM --authfile=/etc/ceph/podman-auth.json --net=host
--entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk
--init -e
CONTAINER_IMAGE=<LOCAL_REGISTRY>/ceph/ceph@sha256:af50ec26db7ee177e1ec1b553a0d6a9dbad2c3cc0da2f8f46d012184a79d4f92
-e NODE_NAME=nautilus -e CEPH_USE_RANDOM_NONCE=1 -e
CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v
/var/run/ceph/<FSID>:/var/run/ceph:z -v
/var/log/ceph/<FSID>:/var/log/ceph:z -v
/var/lib/ceph/<FSID>/crash:/var/lib/ceph/crash:z -v /dev:/dev -v
/run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v
/run/lock/lvm:/run/lock/lvm -v /:/rootfs -v
/tmp/ceph-tmpuydvbhuk:/etc/ceph/ceph.conf:z
<LOCAL_REGISTRY>/ceph/ceph@sha256:af50ec26db7ee177e1ec1b553a0d6a9dbad2c3cc0da2f8f46d012184a79d4f92
lvm create --osd-id 5 --data /dev/sde --block.db /dev/sdb --block.db-size
5G
/usr/bin/podman: stderr time="2023-02-20T09:02:49+01:00" level=warning
msg="Path \"/etc/SUSEConnect\" from \"/etc/containers/mounts.conf\"
doesn't exist, skipping"
/usr/bin/podman: stderr time="2023-02-20T09:02:49+01:00" level=warning
msg="Path \"/etc/zypp/credentials.d/SCCcredentials\" from
\"/etc/containers/mounts.conf\" doesn't exist, skipping"
/usr/bin/podman: stderr Running command: /usr/bin/ceph-authtool
--gen-print-key
/usr/bin/podman: stderr Running command: /usr/bin/ceph --cluster ceph
--name client.bootstrap-osd --keyring
/var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
/usr/bin/podman: stderr  stderr: 2023-02-20T08:02:50.848+0000
7fd255e30700 -1 auth: unable to find a keyring on
/etc/ceph/ceph.client.bootstrap-osd.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin:
(2) No such file or
directory
/usr/bin/podman: stderr  stderr: 2023-02-20T08:02:50.848+0000
7fd255e30700 -1 AuthRegistry(0x7fd250060d50) no keyring found at
/etc/ceph/ceph.client.bootstrap-osd.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,
disabling
cephx
/usr/bin/podman: stderr  stderr: 2023-02-20T08:02:50.852+0000
7fd255e30700 -1 auth: unable to find a keyring on
/var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such file or directory
/usr/bin/podman: stderr  stderr: 2023-02-20T08:02:50.852+0000
7fd255e30700 -1 AuthRegistry(0x7fd250060d50) no keyring found at
/var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
/usr/bin/podman: stderr  stderr: 2023-02-20T08:02:50.856+0000
7fd255e30700 -1 auth: unable to find a keyring on
/var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such file or directory
/usr/bin/podman: stderr  stderr: 2023-02-20T08:02:50.856+0000
7fd255e30700 -1 AuthRegistry(0x7fd250065910) no keyring found at
/var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
/usr/bin/podman: stderr  stderr: 2023-02-20T08:02:50.856+0000
7fd255e30700 -1 auth: unable to find a keyring on
/var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such file or directory
/usr/bin/podman: stderr  stderr: 2023-02-20T08:02:50.856+0000
7fd255e30700 -1 AuthRegistry(0x7fd255e2eea0) no keyring found at
/var/lib/ceph/bootstrap-osd/ceph.keyring, disabling cephx
/usr/bin/podman: stderr  stderr: [errno 2] RADOS object not found
(error connecting to the cluster)
/usr/bin/podman: stderr Traceback (most recent call last):
/usr/bin/podman: stderr   File "/usr/sbin/ceph-volume", line 11, in
<module>
/usr/bin/podman: stderr     load_entry_point('ceph-volume==1.0.0',
'console_scripts', 'ceph-volume')()
/usr/bin/podman: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 41, in
__init__
/usr/bin/podman: stderr     self.main(self.argv)
/usr/bin/podman: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59,
in newfunc
/usr/bin/podman: stderr     return f(*a, **kw)
/usr/bin/podman: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 153, in
main
/usr/bin/podman: stderr     terminal.dispatch(self.mapper, subcommand_args)
/usr/bin/podman: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194,
in dispatch
/usr/bin/podman: stderr     instance.main()
/usr/bin/podman: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/main.py",
line 46, in main
/usr/bin/podman: stderr     terminal.dispatch(self.mapper, self.argv)
/usr/bin/podman: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194,
in dispatch
/usr/bin/podman: stderr     instance.main()
/usr/bin/podman: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/create.py",
line 77, in main
/usr/bin/podman: stderr     self.create(args)
/usr/bin/podman: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16,
in is_root
/usr/bin/podman: stderr     return func(*a, **kw)
/usr/bin/podman: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/create.py",
line 26, in create
/usr/bin/podman: stderr     prepare_step.safe_prepare(args)
/usr/bin/podman: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/prepare.py",
line 252, in safe_prepare
/usr/bin/podman: stderr     self.prepare()
/usr/bin/podman: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16,
in is_root
/usr/bin/podman: stderr     return func(*a, **kw)
/usr/bin/podman: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/prepare.py",
line 292, in prepare
/usr/bin/podman: stderr     self.osd_id =
prepare_utils.create_id(osd_fsid, json.dumps(secrets),
osd_id=self.args.osd_id)
/usr/bin/podman: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/util/prepare.py", line
166, in create_id
/usr/bin/podman: stderr     if osd_id_available(osd_id):
/usr/bin/podman: stderr   File
"/usr/lib/python3.6/site-packages/ceph_volume/util/prepare.py", line
204, in osd_id_available
/usr/bin/podman: stderr     raise RuntimeError('Unable check if OSD id
exists: %s' % osd_id)
/usr/bin/podman: stderr RuntimeError: Unable check if OSD id exists: 5
Traceback (most recent call last):
   File "/usr/sbin/cephadm", line 9170, in <module>
     main()
   File "/usr/sbin/cephadm", line 9158, in main
     r = ctx.func(ctx)
   File "/usr/sbin/cephadm", line 1917, in _infer_config
     return func(ctx)
   File "/usr/sbin/cephadm", line 1877, in _infer_fsid
     return func(ctx)
   File "/usr/sbin/cephadm", line 1945, in _infer_image
     return func(ctx)
   File "/usr/sbin/cephadm", line 1835, in _validate_fsid
     return func(ctx)
   File "/usr/sbin/cephadm", line 5294, in command_ceph_volume
     out, err, code = call_throws(ctx, c.run_cmd())
   File "/usr/sbin/cephadm", line 1637, in call_throws
     raise RuntimeError('Failed command: %s' % ' '.join(command))
RuntimeError: Failed command: /usr/bin/podman run --rm --ipc=host
--stop-signal=SIGTERM --authfile=/etc/ceph/podman-auth.json --net=host
--entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk
--init -e
CONTAINER_IMAGE=<LOCAL_REGISTRY>/ceph/ceph@sha256:af50ec26db7ee177e1ec1b553a0d6a9dbad2c3cc0da2f8f46d012184a79d4f92
-e NODE_NAME=nautilus -e CEPH_USE_RANDOM_NONCE=1 -e
CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v
/var/run/ceph/<FSID>:/var/run/ceph:z -v
/var/log/ceph/<FSID>:/var/log/ceph:z -v
/var/lib/ceph/<FSID>/crash:/var/lib/ceph/crash:z -v /dev:/dev -v
/run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v
/run/lock/lvm:/run/lock/lvm -v /:/rootfs -v
/tmp/ceph-tmpuydvbhuk:/etc/ceph/ceph.conf:z
<LOCAL_REGISTRY>/ceph/ceph@sha256:af50ec26db7ee177e1ec1b553a0d6a9dbad2c3cc0da2f8f46d012184a79d4f92
lvm create --osd-id 5 --data /dev/sde --block.db /dev/sdb --block.db-size
5G
---snip---
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



