We are running the Mimic release of Ceph (13.2.6), and I would like to know the proper way to replace a failed OSD data disk when its DB and WAL live on a separate SSD that is shared with 9 other OSDs.

Specifically, the failing data disk for osd.327 is /dev/sdai, and its WAL/DB are on /dev/sdc, which I set up as a single LVM volume group holding 10 logical volumes, one WAL/DB LV each for osd.320 through osd.329. At deployment time I used pvcreate/vgcreate/lvcreate to make a VG named ssd1 and LVs named db320, db321, and so on, then deployed each OSD with ceph-deploy from an admin node (`ceph-deploy osd create --block-db=ssd1/db327 --data="" <node>`).

My main question is what to do about the separate WAL/DB data, since the Mimic add/remove OSD documentation (https://docs.ceph.com/docs/mimic/rados/operations/add-or-rm-osds/) does not seem to address it.
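In case the exact layout matters, this is roughly how the shared SSD was prepared (the LV size below is only an illustrative value, not what we actually used):

    # one PV/VG on the shared SSD, one DB/WAL LV per OSD
    pvcreate /dev/sdc
    vgcreate ssd1 /dev/sdc
    for i in $(seq 320 329); do
        lvcreate -L 60G -n db$i ssd1    # 60G is an example size only
    done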
1) Do I need to erase the wal/db data on the ssd1/db327 Logical Volume? If so, how should I do that?
2) Assuming 1) is taken care of (and the "old" OSD is destroyed and the "bad" hard drive has been physically replaced with a new one), does this command look correct? `ceph-volume lvm create --osd-id 327 --bluestore --data /dev/sdai --block.db ssd1/db327` (I have sketched the full sequence I have in mind below.)
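To make the questions concrete, this is the rough sequence I have in mind; the `ceph osd destroy` and `ceph-volume lvm zap` steps are my guesses from the docs, not something I have verified:

    # assumption: osd.327 is already stopped and marked out
    ceph osd destroy 327 --yes-i-really-mean-it    # keep the OSD id for reuse

    # wipe the old BlueStore DB/WAL data on the shared SSD LV, keeping the LV itself
    ceph-volume lvm zap ssd1/db327

    # after the failed /dev/sdai has been physically replaced
    ceph-volume lvm create --osd-id 327 --bluestore --data /dev/sdai --block.db ssd1/db327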
Mami Hayashida
Research Computing Associate
Univ. of Kentucky ITS Research Computing Infrastructure