Thanks Eugen and others for the advice. These are not, however, LVM-based
OSDs. I can get a list of what is out there with

  cephadm ceph-volume raw list

and I tried "cephadm ceph-volume raw activate", but it tells me I need to
run activate manually. I was able to find the correct data disks with, for
example,

  ceph-bluestore-tool show-label --dev /dev/sda2

but on running e.g.

  cephadm ceph-volume raw activate --osd-id 20 --device /dev/sda \
      --osd-uuid 74f4ce9c-4623-41b7-a7f9-cc81bb9467ef \
      --block.db /dev/nvme1n1p1 --block.wal /dev/nvme0n1p1

(the OSD ID inferred from the list of down OSDs), I got an error that
"systemd support not yet implemented". On adding --no-systemd to the
command, I get the response

  stderr KeyError: 'osd_id'

The on-disk metadata indeed doesn't have an osd_id for most entries. For the
one instance where I can find the osd_id key in the metadata, "cephadm
ceph-volume raw activate" completes, but with no apparent change to the
system.
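For completeness, this is roughly how I have been walking the candidate data
partitions to see which labels carry an OSD id at all. The device list below
is just an example (only /dev/sda2 is one I have actually checked), and my
assumption -- based on the KeyError above -- is that raw activate derives
osd_id from the "whoami" field of the BlueStore label:

  for dev in /dev/sda2 /dev/sdb2 /dev/sdc2; do   # example devices, not the full set
      echo "== ${dev} =="
      # show-label dumps the BlueStore label as JSON; look for an id/uuid
      ceph-bluestore-tool show-label --dev "${dev}" | grep -E '"whoami"|"osd_uuid"'
  done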
Is there any advice on how to recover the configuration with raw, not LVM,
OSDs?

And then, once I have things added back in: the host is currently listed as
offline in the output of "ceph orch host ls". How can it be re-added to this
list?

Thank you,
Peter

BTW, the full error message:

Inferring fsid ed7b2c16-b053-45e2-a1fe-bf3474f90508
Using ceph image with id '59248721b0c7' and tag 'v17' created on 2024-04-24 16:06:51 +0000 UTC
quay.io/ceph/ceph@sha256:96f2a53bc3028eec16e790c6225e7d7acad8a48737a57ec14eea7ce036733233
Non-zero exit code 1 from /usr/bin/docker run --rm --ipc=host --stop-signal=SIGTERM --ulimit nofile=1048576 --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:96f2a53bc3028eec16e790c6225e7d7acad8a48737a57ec14eea7ce036733233 -e NODE_NAME=ceph-osd3 -e CEPH_USE_RANDOM_NONCE=1 -e CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v /var/log/ceph/ed7b2c16-b053-45e2-a1fe-bf3474f90508:/var/log/ceph:z -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /:/rootfs -v /tmp/ceph-tmpjox0_hj0:/etc/ceph/ceph.conf:z quay.io/ceph/ceph@sha256:96f2a53bc3028eec16e790c6225e7d7acad8a48737a57ec14eea7ce036733233 raw activate --osd-id 20 --device /dev/sda --osd-uuid 74f4ce9c-4623-41b7-a7f9-cc81bb9467ef --block.db /dev/nvme1n1p1 --block.wal /dev/nvme0n1p1 --no-systemd
/usr/bin/docker: stderr Traceback (most recent call last):
/usr/bin/docker: stderr   File "/usr/sbin/ceph-volume", line 11, in <module>
/usr/bin/docker: stderr     load_entry_point('ceph-volume==1.0.0', 'console_scripts', 'ceph-volume')()
/usr/bin/docker: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 41, in __init__
/usr/bin/docker: stderr     self.main(self.argv)
/usr/bin/docker: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in newfunc
/usr/bin/docker: stderr     return f(*a, **kw)
/usr/bin/docker: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 153, in main
/usr/bin/docker: stderr     terminal.dispatch(self.mapper, subcommand_args)
/usr/bin/docker: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
/usr/bin/docker: stderr     instance.main()
/usr/bin/docker: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/main.py", line 32, in main
/usr/bin/docker: stderr     terminal.dispatch(self.mapper, self.argv)
/usr/bin/docker: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
/usr/bin/docker: stderr     instance.main()
/usr/bin/docker: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/activate.py", line 166, in main
/usr/bin/docker: stderr     systemd=not self.args.no_systemd)
/usr/bin/docker: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
/usr/bin/docker: stderr     return func(*a, **kw)
/usr/bin/docker: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/activate.py", line 79, in activate
/usr/bin/docker: stderr     osd_id = meta['osd_id']
/usr/bin/docker: stderr KeyError: 'osd_id'
Traceback (most recent call last):
  File "/usr/sbin/cephadm", line 9679, in <module>
    main()
  File "/usr/sbin/cephadm", line 9667, in main
    r = ctx.func(ctx)
  File "/usr/sbin/cephadm", line 2116, in _infer_config
    return func(ctx)
  File "/usr/sbin/cephadm", line 2061, in _infer_fsid
    return func(ctx)
  File "/usr/sbin/cephadm", line 2144, in _infer_image
    return func(ctx)
  File "/usr/sbin/cephadm", line 2019, in _validate_fsid
    return func(ctx)
  File "/usr/sbin/cephadm", line 6272, in command_ceph_volume
    out, err, code = call_throws(ctx, c.run_cmd(), verbosity=CallVerbosity.QUIET_UNLESS_ERROR)
  File "/usr/sbin/cephadm", line 1807, in call_throws
    raise RuntimeError('Failed command: %s' % ' '.join(command))
RuntimeError: Failed command: /usr/bin/docker run --rm --ipc=host --stop-signal=SIGTERM --ulimit nofile=1048576 --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:96f2a53bc3028eec16e790c6225e7d7acad8a48737a57ec14eea7ce036733233 -e NODE_NAME=ceph-osd3 -e CEPH_USE_RANDOM_NONCE=1 -e CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v /var/log/ceph/ed7b2c16-b053-45e2-a1fe-bf3474f90508:/var/log/ceph:z -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /:/rootfs -v /tmp/ceph-tmpjox0_hj0:/etc/ceph/ceph.conf:z quay.io/ceph/ceph@sha256:96f2a53bc3028eec16e790c6225e7d7acad8a48737a57ec14eea7ce036733233 raw activate --osd-id 20 --device /dev/sda --osd-uuid 74f4ce9c-4623-41b7-a7f9-cc81bb9467ef --block.db /dev/nvme1n1p1 --block.wal /dev/nvme0n1p1 --no-systemd

On Wed, 24 Apr 2024 at 14:47, Eugen Block <eblock@xxxxxx> wrote:

> In addition to Nico's response, three years ago I wrote a blog post
> [1] about that topic; maybe that can help as well. It might be a bit
> outdated. What it definitely doesn't contain is this command from the
> docs [2], to be run once the server has been re-added to the host list:
>
> ceph cephadm osd activate <host>
>
> Regards,
> Eugen
>
> [1] https://heiterbiswolkig.blogs.nde.ag/2021/02/08/cephadm-reusing-osds-on-reinstalled-server/
> [2] https://docs.ceph.com/en/latest/cephadm/services/osd/#activate-existing-osds
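Just to check my understanding of the docs page Eugen links as [2] before I
try it: once the reinstalled host is reachable again, would the sequence be
roughly the following? The hostname is ours (ceph-osd3, as in the log above);
the key-copy step is my assumption and presumably only needed if the
reinstall lost cephadm's SSH key.

  # refresh the orchestrator's SSH access to the reinstalled host
  ceph cephadm get-pub-key > ~/ceph.pub
  ssh-copy-id -f -i ~/ceph.pub root@ceph-osd3
  # (re-)add the host, then let cephadm scan and activate the existing OSDs
  ceph orch host add ceph-osd3
  ceph cephadm osd activate ceph-osd3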
> Zitat von Nico Schottelius <nico.schottelius@xxxxxxxxxxx>:
>
> > Hey Peter,
> >
> > the /var/lib/ceph directories mainly contain "meta data" that, depending
> > on the ceph version and OSD setup, can even reside on tmpfs by default.
> >
> > Even if the data was on disk, it is easy to recreate:
> >
> > --------------------------------------------------------------------------------
> > [root@rook-ceph-osd-36-6876cdb479-4764r ceph-36]# ls -l
> > total 28
> > lrwxrwxrwx 1 ceph ceph  8 Feb  7 12:12 block -> /dev/sde
> > -rw------- 1 ceph ceph 37 Feb  7 12:12 ceph_fsid
> > -rw------- 1 ceph ceph 37 Feb  7 12:12 fsid
> > -rw------- 1 ceph ceph 56 Feb  7 12:12 keyring
> > -rw------- 1 ceph ceph  6 Feb  7 12:12 ready
> > -rw------- 1 ceph ceph  3 Feb  7 12:12 require_osd_release
> > -rw------- 1 ceph ceph 10 Feb  7 12:12 type
> > -rw------- 1 ceph ceph  3 Feb  7 12:12 whoami
> > [root@rook-ceph-osd-36-6876cdb479-4764r ceph-36]#
> > --------------------------------------------------------------------------------
> >
> > We used to create OSDs manually on Alpine Linux some years ago using
> > [0]; you can check it out as an inspiration for what should be in which
> > file.
> >
> > BR,
> >
> > Nico
> >
> > [0] https://code.ungleich.ch/ungleich-public/ungleich-tools/src/branch/master/ceph/ceph-osd-create-start-alpine
> >
> > Peter van Heusden <pvh@xxxxxxxxxxx> writes:
> >
> >> Dear Ceph Community
> >>
> >> We have 5 OSD servers running Ceph v15.2.17. The host operating system
> >> is Ubuntu 20.04.
> >>
> >> One of the servers has suffered corruption to its boot operating system.
> >> Using a system rescue disk it is possible to mount the root filesystem,
> >> but it is not possible to boot the operating system at the moment.
> >>
> >> The OSDs are configured with (spinning disk) data drives, with WALs and
> >> DBs on partitions of SSDs, but from my examination of the filesystem the
> >> configuration in /var/lib/ceph appears to be corrupted.
> >>
> >> So my question is: what is the best option for repair going forward? Is
> >> it possible to do a clean install of the operating system and scan the
> >> existing drives in order to reconstruct the OSD configuration?
> >>
> >> Thank you,
> >> Peter
> >>
> >> P.S. The cause of the original corruption is likely an unplanned power
> >> outage, an event that hopefully will not recur.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx