replace failed disk in Luminous v12.2.2

Dietmar Rieder <dietmar.rieder@xxxxxxxxxxx> · Thu, 11 Jan 2018 10:30:49 +0100

Hello,

We have a failed OSD disk in our Luminous v12.2.2 cluster that needs to
be replaced.

The cluster was initially deployed using ceph-deploy on Luminous
v12.2.0. The OSDs were created using

ceph-deploy osd create --bluestore cephosd-${osd}:/dev/sd${disk} \
    --block-wal /dev/nvme0n1 --block-db /dev/nvme0n1

Note that we separated the bluestore data, wal and db.

We updated to Luminous v12.2.1 and further to Luminous v12.2.2.

With the last update we also let ceph-volume take over the OSDs using
"ceph-volume simple scan /var/lib/ceph/osd/$osd" and "ceph-volume
simple activate ${osd} ${id}". All of this went smoothly.
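
For reference, the per-OSD takeover could be scripted roughly like this. It is only a dry-run sketch: plan_simple_takeover is a made-up helper name, and it assumes the OSD directory holds the usual "whoami" (OSD id) and "fsid" (OSD uuid) files and that "ceph-volume simple activate" takes the id followed by the fsid. It only prints the commands and runs nothing.

```shell
# Hypothetical dry-run helper (not part of Ceph): print the two takeover
# commands for one OSD directory instead of executing them.
plan_simple_takeover() {
    osd_dir="$1"   # e.g. /var/lib/ceph/osd/ceph-12
    # Assumption: whoami holds the OSD id, fsid holds the OSD uuid.
    if [ ! -r "${osd_dir}/whoami" ] || [ ! -r "${osd_dir}/fsid" ]; then
        echo "missing whoami or fsid in ${osd_dir}" >&2
        return 1
    fi
    osd_id="$(cat "${osd_dir}/whoami")"
    osd_fsid="$(cat "${osd_dir}/fsid")"
    echo "ceph-volume simple scan ${osd_dir}"
    echo "ceph-volume simple activate ${osd_id} ${osd_fsid}"
}

# Example call (prints only):
#   plan_simple_takeover /var/lib/ceph/osd/ceph-12
```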

Now I wonder what the correct way to replace a failed OSD block disk is.

The docs for luminous [1] say:

REPLACING AN OSD

1. Destroy the OSD first:

ceph osd destroy {id} --yes-i-really-mean-it

2. Zap a disk for the new OSD, if the disk was used before for other
purposes. It’s not necessary for a new disk:

ceph-disk zap /dev/sdX

3. Prepare the disk for replacement by using the previously destroyed
OSD id:

ceph-disk prepare --bluestore /dev/sdX  --osd-id {id} --osd-uuid `uuidgen`

4. And activate the OSD:

ceph-disk activate /dev/sdX1
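
Put together, the four documented steps would look roughly like the dry-run sketch below. plan_replace_osd is a made-up helper name, and the OSD id and device in the example call are placeholders. It only prints the commands exactly as the docs give them, so it does not deal with the separate wal and db partitions on the NVMe device.

```shell
# Hypothetical dry-run helper (not part of Ceph): print the four documented
# replacement steps for one OSD instead of executing them.
plan_replace_osd() {
    osd_id="$1"   # id of the failed OSD (placeholder, e.g. 12)
    dev="$2"      # replacement data disk (placeholder, e.g. /dev/sdc)
    if [ -z "$osd_id" ] || [ -z "$dev" ]; then
        echo "usage: plan_replace_osd <osd-id> <device>" >&2
        return 1
    fi
    echo "ceph osd destroy ${osd_id} --yes-i-really-mean-it"
    echo "ceph-disk zap ${dev}"
    # \$(uuidgen) is printed literally; it would be expanded when the line is run.
    echo "ceph-disk prepare --bluestore ${dev} --osd-id ${osd_id} --osd-uuid \$(uuidgen)"
    # The docs activate the first partition of the prepared disk.
    echo "ceph-disk activate ${dev}1"
}

# Example call (prints only):
#   plan_replace_osd 12 /dev/sdc
```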

At first this seems straightforward, but...

1. I'm not sure whether anything has to be done with the bluefs db and
wal partitions that still exist on the nvme device for the failed OSD.
Do they have to be zapped? If yes, what is the best way? Nothing about
this is mentioned in the docs.
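
To at least identify which nvme partitions belong to the failed OSD, something like the sketch below might help. show_db_wal_targets is a made-up helper name, and it assumes the OSD directory is still readable and contains block.db and block.wal symlinks. It only reports where the links point and changes nothing on disk.

```shell
# Hypothetical helper (not part of Ceph): show where an OSD's block.db and
# block.wal symlinks point. Read-only; nothing is zapped or modified.
# Assumption: the OSD directory (e.g. /var/lib/ceph/osd/ceph-12) is readable.
show_db_wal_targets() {
    osd_dir="$1"
    for name in block.db block.wal; do
        link="${osd_dir}/${name}"
        if [ -L "$link" ]; then
            # readlink -f follows the symlink chain to the final target
            echo "${name} $(readlink -f "$link")"
        else
            echo "${name} (no symlink found)"
        fi
    done
}

# Example call:
#   show_db_wal_targets /var/lib/ceph/osd/ceph-12
```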

2. Since we already let "ceph-volume simple" take over our OSDs, I'm not
sure whether we should now use ceph-volume or again ceph-disk (followed
by a "ceph-volume simple" takeover) to prepare and activate the OSD.

3. If we should use ceph-volume, then looking at the luminous
ceph-volume docs [2] I find for both

ceph-volume lvm prepare
ceph-volume lvm activate

that the bluestore option is either NOT implemented or NOT supported:

activate: [--bluestore] filestore (IS THIS A TYPO???) objectstore (not
yet implemented)
prepare: [--bluestore] Use the bluestore objectstore (not currently
supported)

So now I'm completely lost. How does all of this fit together in
order to replace a failed OSD?

4. More... after reading some recent threads on this list, additional
questions are coming up:

According to the OSD replacement doc [1] :

"When disks fail, [...], OSDs need to be replaced. Unlike Removing the
OSD, replaced OSD’s id and CRUSH map entry need to be keep [TYPO HERE?
keep -> kept] intact after the OSD is destroyed for replacement."

but
http://tracker.ceph.com/issues/22642 seems to say that it is not
possible to reuse an OSD's id.

So I'm quite lost with what should be an essential and seemingly simple
task of storage management.

Thanks for any help here.

~Dietmar

[1]: http://docs.ceph.com/docs/luminous/rados/operations/add-or-rm-osds/
[2]: http://docs.ceph.com/docs/luminous/man/8/ceph-volume/

-- 
_________________________________________
D i e t m a r  R i e d e r, Mag.Dr.
Innsbruck Medical University
Biocenter - Division for Bioinformatics
Email: dietmar.rieder@xxxxxxxxxxx
Web:   http://www.icbi.at
