Re: Proper procedure to replace DB/WAL SSD

Caspar, it looks like your idea should work. Worst case, the OSDs wouldn't start; you'd put the old SSD back in and fall back to the other plan: weight them to 0, let backfilling complete, then recreate the OSDs. Definitely worth a try in my opinion, and I'd love to hear about your experience afterwards.
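For what it's worth, the clone approach Caspar describes below might be sketched like this. The OSD ids and device names are placeholders for illustration, and this is untested:

```shell
# Hedged sketch of the SSD-clone approach (untested; OSD ids 10-14 and
# device names are placeholders).

# 1) Prevent CRUSH from marking the stopped OSDs out during the swap
ceph osd set noout

# 2) Stop the five OSDs whose DB partitions live on the old SSD
for id in 10 11 12 13 14; do systemctl stop ceph-osd@$id; done

# 3) Clone the old SSD onto the new one (same size or larger)
dd if=/dev/sdOLD of=/dev/sdNEW bs=4M conv=fsync status=progress

# 4) Physically swap the SSDs, then restart the OSDs
for id in 10 11 12 13 14; do systemctl start ceph-osd@$id; done

# 5) Wait for recovery of the delta written since the stop, then:
ceph osd unset noout
```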

Nico, it is not possible to change the WAL or DB size, location, etc. after OSD creation. If you want to change the configuration of an OSD after creation, you have to remove it from the cluster and recreate it. There is no equivalent of the way you could move or recreate filestore OSD journals. I think this might be on the radar as a feature, but I don't know for certain. I definitely consider it a regression in bluestore.
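Removing and recreating, as described above, would look roughly like this on Luminous. The OSD ids and device paths are placeholders for illustration:

```shell
# Hedged sketch of the drain-and-recreate path (untested; OSD ids 10-14
# and device paths are placeholders).

# 1) Drain the five OSDs and wait for backfilling to finish
for id in 10 11 12 13 14; do ceph osd reweight $id 0; done
# ... wait until `ceph -s` shows backfilling complete ...

# 2) Stop and purge each OSD (Luminous's purge covers out, crush rm,
#    auth del, and osd rm in one step)
for id in 10 11 12 13 14; do
    systemctl stop ceph-osd@$id
    ceph osd purge $id --yes-i-really-mean-it
done

# 3) After replacing the SSD, recreate each OSD with its DB on the new
#    device (one example; repeat per HDD/partition pair)
ceph-volume lvm create --data /dev/sdb --block.db /dev/sdNEW1
```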



On Fri, Feb 23, 2018, 9:13 AM Nico Schottelius <nico.schottelius@xxxxxxxxxxx> wrote:

A very interesting question, and I would add the follow-up question:

Is there an easy way to add an external DB/WAL device to an existing
OSD?

I suspect that it might be something along the lines of:

- stop osd
- create a link in ...ceph/osd/ceph-XX/block.db to the target device
- (maybe run some kind of osd mkfs ?)
- start osd
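Spelled out, the hypothetical sequence above might look like the following. This is untested speculation (the device path and OSD id are placeholders), and note that the reply at the top of this thread says bluestore does not support this:

```shell
# Untested sketch of the hypothetical migration above; OSD id and
# partition are placeholders. Per the reply in this thread, bluestore
# does NOT support adding a DB/WAL device after OSD creation.
systemctl stop ceph-osd@10
ln -s /dev/disk/by-partuuid/PLACEHOLDER /var/lib/ceph/osd/ceph-10/block.db
# (some "osd mkfs"-like step to initialize/migrate the DB would be
#  needed here; no such tool is known to exist at this time)
systemctl start ceph-osd@10
```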

Has anyone done this so far, or does anyone have recommendations on how to do it?

Which also makes me wonder: what is actually the format of WAL and
BlockDB in bluestore? Is there any documentation available about it?
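I'm not aware of separate format documentation, but as far as I know the DB is a RocksDB instance stored inside bluestore's internal BlueFS filesystem (and the WAL is RocksDB's write-ahead log), and ceph-bluestore-tool can inspect it. The OSD path below is a placeholder:

```shell
# Inspecting a (stopped) bluestore OSD's on-disk metadata; the OSD path
# is a placeholder. show-label prints the bluestore device labels;
# bluefs-export dumps the BlueFS contents (RocksDB files) to a directory.
ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-10
ceph-bluestore-tool bluefs-export --path /var/lib/ceph/osd/ceph-10 \
    --out-dir /tmp/bluefs-dump
```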

Best,

Nico


Caspar Smit <casparsmit@xxxxxxxxxxx> writes:

> Hi All,
>
> What would be the proper way to preventively replace a DB/WAL SSD (when it
> is nearing its DWPD/TBW limit but has not failed yet)?
>
> It hosts DB partitions for 5 OSDs.
>
> Maybe something like:
>
> 1) ceph osd reweight the 5 OSDs to 0
> 2) let backfilling complete
> 3) destroy/remove the 5 OSDs
> 4) replace the SSD
> 5) create 5 new OSDs with separate DB partitions on the new SSD
>
> When these 5 OSDs are big HDDs (8TB), a LOT of data has to be moved, so I
> thought maybe the following would work:
>
> 1) ceph osd set noout
> 2) stop the 5 OSDs (systemctl stop)
> 3) 'dd' the old SSD to a new SSD of the same or bigger size
> 4) remove the old SSD
> 5) start the 5 OSDs (systemctl start)
> 6) let backfilling/recovery complete (only the delta data between OSD stop
> and now)
> 7) ceph osd unset noout
>
> Would this be a viable method to replace a DB SSD? Is there any udev/serial
> number/uuid stuff that would prevent it from working?
>
> Or is there another, less hacky way to replace a DB SSD without moving too
> much data?
>
> Kind regards,
> Caspar
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


--
Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch
