Re: Proper procedure to replace DB/WAL SSD

On Wed, May 2, 2018 at 12:18 PM, Nicolas Huillard <nhuillard@xxxxxxxxxxx> wrote:
> On Sunday, 8 April 2018 at 20:40 +0000, Jens-U. Mozdzen wrote:
>> sorry for bringing up that old topic again, but we just faced a
>> corresponding situation and have successfully tested two migration
>> scenarios.
>
> Thank you very much for this update, as I needed to do exactly that,
> due to an SSD crash triggering a hardware replacement.
> The block.db volumes on the crashed SSD were lost, so both OSDs
> depending on it were re-created. I also replaced two other failing
> SSDs before they died, so I effectively had to replace DB/WAL devices
> on the live cluster (2 SSDs on 2 hosts, 4 OSDs in total).
>
>> it is possible to move a separate WAL/DB to a new device without
>> changing its size. We have done this for multiple OSDs, using only
>> existing (mainstream :) ) tools, and have documented the procedure at
>> http://heiterbiswolkig.blogs.nde.ag/2018/04/08/migrating-bluestores-block-db/
>> It will *not* allow separating the WAL/DB after OSD creation, nor
>> does it allow changing the DB size.
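
For readers who don't follow the link: in rough outline, the migration
copies the DB verbatim onto an identically-sized new device and
re-points the OSD at it (a sketch only; device names and the OSD <id>
are placeholders, and the post above has the authoritative steps):

    systemctl stop ceph-osd@<id>
    # byte-for-byte copy of the old block.db onto the new device
    dd if=/dev/vg-old/db-lv of=/dev/vg-new/db-lv bs=1M
    # re-point the OSD's block.db symlink (and LVM tags) at the
    # new device, then restart
    systemctl start ceph-osd@<id>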
>
> The lost OSDs were still backfilling when I did the above procedure
> (data redundancy was high enough to risk losing one more node). I even
> mis-typed the "ceph osd set noout" command ("ceph osd unset noout"
> instead, effectively a no-op), and replaced 2 OSDs of a single host at
> the same time (thus taking longer than the 10 minutes after which the
> OSDs get kicked out, and triggering even more data movement).
> Everything went cleanly though, thanks to your detailed commands, which
> I ran one at a time, thinking twice before each [Enter].
>
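For reference, the intended flag handling is short (a sketch; the
stop/migrate steps in between are elided):

    # keep CRUSH from marking the stopped OSDs "out" while they're down
    ceph osd set noout
    # ...stop the OSDs, migrate the DB/WAL, restart the OSDs...
    ceph osd unset noout
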
> I dug a bit into the LVM tags:
> * make a backup of all PV/VG/LV config: vgcfgbackup
> * check the backed-up tags: grep tags /etc/lvm/backup/*
>
> I then noticed that:
> * there are lots of "ceph.*=" tags
> * the tags are still present on the old DB/WAL LVs (since I didn't
> remove them)
> * the tags are absent from the new DB/WAL LVs (ditto, I didn't create
> them), which may be a problem later on...

This is absolutely going to be a problem if I understand correctly that
these OSDs are handled by ceph-volume, because it reads those tags to
bring the OSDs up.
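
A quick way to check what it has on record:

    # report the metadata ceph-volume reads at activation time
    ceph-volume lvm list
    # or inspect the raw LVM tags directly
    lvs -o lv_name,vg_name,lv_tags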

> * I changed the ceph.db_device= tag, but there is also a ceph.db_uuid=
> tag that I did not change, and which may or may not cause a problem on
> reboot (I don't know whether this UUID is part of the dd'ed data)

You can certainly get into a situation where ceph-volume needs one of
these tags, can't find it, and breaks.
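
If it comes to fixing the tags by hand, lvchange can rewrite them; a
sketch, with placeholder device paths, VG/LV name and UUIDs:

    # point the OSD's data LV at the new DB device (values illustrative)
    lvchange --deltag "ceph.db_device=/dev/sdg1" ceph-vg/osd-block-0
    lvchange --addtag "ceph.db_device=/dev/sdh1" ceph-vg/osd-block-0
    # ceph.db_uuid should match the new device; blkid shows its UUID
    lvchange --deltag "ceph.db_uuid=OLD-UUID" ceph-vg/osd-block-0
    lvchange --addtag "ceph.db_uuid=NEW-UUID" ceph-vg/osd-block-0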

>
> You really helped a lot! Thanks.
>
> --
> Nicolas Huillard
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



