Re: Reconstructing an OSD server when the boot OS is corrupted


 



In addition to Nico's response: three years ago I wrote a blog post [1] about this topic, which may help as well. It might be a bit outdated. What it definitely doesn't contain is this command from the docs [2], to be run once the server has been re-added to the host list:

ceph cephadm osd activate <host>
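
For orientation, here is a minimal sketch of where that command might sit in the overall sequence on a cephadm-managed cluster. The helper name print_reactivation_steps, the hostname and the IP address are placeholders invented for illustration, not values from this thread. The helper only prints the commands so they can be reviewed first. Whether "ceph cephadm osd activate" is available depends on the release in use, so check the docs [2] for your version.

```shell
#!/bin/sh
# Dry-run helper (hypothetical): prints the commands instead of running them.
# The host name and IP passed in are placeholders, not values from this thread.
print_reactivation_steps() {
    host="$1"
    ip="$2"
    # 1. Give the orchestrator SSH access to the reinstalled host again.
    echo "ceph cephadm get-pub-key > ~/ceph.pub"
    echo "ssh-copy-id -f -i ~/ceph.pub root@${host}"
    # 2. Re-add the host to the host list.
    echo "ceph orch host add ${host} ${ip}"
    # 3. Ask cephadm to activate the OSDs that already exist on its disks.
    echo "ceph cephadm osd activate ${host}"
}

print_reactivation_steps osd-host-5 192.0.2.15
```

The printed lines can be pasted into a shell on a node with an admin keyring once they look right for the environment.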

Regards,
Eugen

[1] https://heiterbiswolkig.blogs.nde.ag/2021/02/08/cephadm-reusing-osds-on-reinstalled-server/
[2] https://docs.ceph.com/en/latest/cephadm/services/osd/#activate-existing-osds

Quoting Nico Schottelius <nico.schottelius@xxxxxxxxxxx>:

Hey Peter,

the /var/lib/ceph directories mainly contain metadata that, depending
on the Ceph version and OSD setup, may even reside on tmpfs by
default.

Even if that data was on disk, it is easy to recreate:

--------------------------------------------------------------------------------
[root@rook-ceph-osd-36-6876cdb479-4764r ceph-36]# ls -l
total 28
lrwxrwxrwx 1 ceph ceph  8 Feb  7 12:12 block -> /dev/sde
-rw------- 1 ceph ceph 37 Feb  7 12:12 ceph_fsid
-rw------- 1 ceph ceph 37 Feb  7 12:12 fsid
-rw------- 1 ceph ceph 56 Feb  7 12:12 keyring
-rw------- 1 ceph ceph  6 Feb  7 12:12 ready
-rw------- 1 ceph ceph  3 Feb  7 12:12 require_osd_release
-rw------- 1 ceph ceph 10 Feb  7 12:12 type
-rw------- 1 ceph ceph  3 Feb  7 12:12 whoami
[root@rook-ceph-osd-36-6876cdb479-4764r ceph-36]#
--------------------------------------------------------------------------------

We used to create OSDs manually on Alpine Linux some years ago using
[0]; you can check it out as inspiration for what should be in which
file.
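
As a rough sketch of recreating such a directory for a BlueStore OSD: the metadata lives in a label on the block device, which ceph-bluestore-tool can display (show-label) and use to repopulate the OSD directory (prime-osd-dir). The run and rebuild_osd_dir helpers below are hypothetical, and the device and OSD id are simply the ones from the listing above, used as an example. By default the script only prints what it would do.

```shell
#!/bin/sh
# Dry-run by default: set DRY_RUN=0 to really execute the commands.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: device and OSD id are example values only.
rebuild_osd_dir() {
    dev="$1"
    osd_id="$2"
    dir="/var/lib/ceph/osd/ceph-${osd_id}"
    # Show the BlueStore label (OSD uuid, whoami, etc.) stored on the device.
    run ceph-bluestore-tool show-label --dev "$dev"
    # Recreate the OSD directory and fill it from the label.
    run mkdir -p "$dir"
    run ceph-bluestore-tool prime-osd-dir --dev "$dev" --path "$dir"
}

rebuild_osd_dir /dev/sde 36
```

For OSDs that were deployed with ceph-volume on LVM, "ceph-volume lvm activate --all" is usually the simpler route, since it discovers the OSDs and performs these steps itself. Ownership of the files and the block symlink may still need checking afterwards.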

BR,

Nico


[0] https://code.ungleich.ch/ungleich-public/ungleich-tools/src/branch/master/ceph/ceph-osd-create-start-alpine

Peter van Heusden <pvh@xxxxxxxxxxx> writes:

Dear Ceph Community

We have 5 OSD servers running Ceph v15.2.17. The host operating system is
Ubuntu 20.04.

One of the servers has suffered corruption to its boot operating system.
Using a system rescue disk it is possible to mount the root filesystem but
it is not possible to boot the operating system at the moment.

The OSDs are configured with (spinning disk) data drives, WALs and DBs on
partitions of SSDs, but from my examination of the filesystem the
configuration in /var/lib/ceph appears to be corrupted.

So my question is: what is the best option for repair going forward? Is it
possible to do a clean install of the operating system and scan the
existing drives in order to reconstruct the OSD configuration?

Thank you,
Peter
P.S. The original corruption was likely caused by an unplanned
power outage, an event that hopefully will not recur.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx