Hi,
I just wanted to make sure that our latest findings reach the OP of
this thread. We posted them in a different thread [1] and hope this
helps some of you.
It is possible to migrate a journal from one partition to another
with almost no downtime for the OSD. But it's *not* sufficient to dd
the journal to the new partition and replace the symlink. The OSD
will restart successfully only if the old partition still exists;
you'll find references to it under /proc/<PID>/fd. Removing the old
partition will prevent the OSD from starting. You can find details in
the provided link [1].
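As a quick sanity check, something like this shows whether a running
OSD still holds file descriptors on the old partition (a sketch; the
OSD id "0" and the old partition name "sdc2" are placeholders):

    # Find the PID of the running OSD daemon ("0" is a placeholder id).
    OSD_PID=$(pgrep -f 'ceph-osd.*--id 0 ')
    # Any hits here mean the OSD still references the old partition.
    ls -l /proc/"$OSD_PID"/fd | grep sdc2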
We managed to replace the journals of six 1 TB OSDs residing on the
same host within 25 minutes in our production environment.
Note: this only applies if the wal/db already reside on a separate partition.
Currently, I'm looking for a way to extract the journal of an
all-in-one (bluestore) OSD into a separate partition. I thought maybe
"ceph-objectstore-tool --op dump-journal" could do the trick, but this
command doesn't work for me. Does anyone have any insight on this?
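For clarity, the invocation I mean is roughly this (the data path is
just an example, and the OSD has to be stopped first):

    # Run with the OSD stopped; the data path is an example.
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --op dump-journal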
Regards,
Eugen
[1] http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-April/025930.html
----- Forwarded message from Ronny Aasen <ronny+ceph-users@xxxxxxxx> -----
Date: Fri, 17 Nov 2017 17:04:36 +0100
From: Ronny Aasen <ronny+ceph-users@xxxxxxxx>
Subject: Re: Moving bluestore WAL and DB after bluestore creation
To: ceph-users@xxxxxxxxxxxxxx
On 16.11.2017 09:45, Loris Cuoghi wrote:
On Wed, 15 Nov 2017 19:46:48 +0000,
Shawn Edwards <lesser.evil@xxxxxxxxx> wrote:
On Wed, Nov 15, 2017, 11:07 David Turner <drakonstein@xxxxxxxxx>
wrote:
I'm not going to lie. This makes me dislike Bluestore quite a
bit. Running multiple OSDs against a single SSD journal allowed you
to monitor the write endurance of the SSD and replace it without
having to out and re-add all of the OSDs on the device. Having to
now out and backfill back onto the HDDs is awful, and it would have
made the time I realized that 20 journal SSDs had all run low on
write endurance at once nearly impossible to recover from.
Flushing journals, replacing SSDs, and bringing it all back online
was a slick process. Formatting the HDDs and backfilling back onto
the same disks sounds like a big regression. A process to migrate
the WAL and DB onto the HDD and then back off to a new device would
be very helpful.
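For reference, that filestore-era replacement went roughly like this
(a sketch; the OSD id "0" and /dev/sdf1 are placeholders):

    # Keep CRUSH from rebalancing while the OSD is down.
    ceph osd set noout
    systemctl stop ceph-osd@0
    ceph-osd -i 0 --flush-journal    # drain the old journal to the data disk
    # ...replace the SSD and recreate the journal partition...
    ln -sf /dev/sdf1 /var/lib/ceph/osd/ceph-0/journal
    ceph-osd -i 0 --mkjournal        # initialize the new journal
    systemctl start ceph-osd@0
    ceph osd unset noout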
On Wed, Nov 15, 2017 at 10:51 AM Mario Giammarco
<mgiammarco@xxxxxxxxx> wrote:
It seems it is not possible. I recreated the OSD.
2017-11-12 17:44 GMT+01:00 Shawn Edwards <lesser.evil@xxxxxxxxx>:
I've created some Bluestore OSD with all data (wal, db, and data)
all on the same rotating disk. I would like to now move the wal
and db onto an nvme disk. Is that possible without re-creating
the OSD?
This. Exactly this. Not being able to move the .db and .wal data on
and off the main storage disk on Bluestore is a regression.
Hello,
What stops you from dd'ing the DB/WAL partitions onto another disk
and updating the symlinks in the OSD's mount point under
/var/lib/ceph/osd?
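Something along these lines (a sketch only; the device names and the
OSD id are placeholders, and note the caveats below):

    # Sketch; /dev/sdb2 (old) and /dev/nvme0n1p1 (new) are placeholders.
    systemctl stop ceph-osd@0
    dd if=/dev/sdb2 of=/dev/nvme0n1p1 bs=1M
    ln -sf /dev/nvme0n1p1 /var/lib/ceph/osd/ceph-0/block.db
    systemctl start ceph-osd@0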
This probably works if you deployed bluestore with partitions, but
if you did not create partitions for block.db at original bluestore
creation, there is no block.db symlink; db and wal are mixed into
the block partition and are not easy to extract. Also, just dd'ing
the block device may not help if you want to change the size of the
db partition. This needs more testing. Tools will probably be
created in the future for resizing db and wal partitions, and for
extracting db data from block into a separate block.db partition.
dd'ing block.db would probably work when you need to replace a
worn-out SSD, but not so much if you want to deploy a separate
block.db for a bluestore made without block.db.
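To check which case you are in (the path "ceph-0" is just an
example):

    # An all-in-one bluestore OSD shows only "block"; one with a
    # separate DB also shows a "block.db" symlink.
    ls -l /var/lib/ceph/osd/ceph-0/block*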
kind regards
Ronny Aasen
----- End forwarded message -----
--
Eugen Block voice : +49-40-559 51 75
NDE Netzdesign und -entwicklung AG fax : +49-40-559 51 77
Postfach 61 03 15
D-22423 Hamburg e-mail : eblock@xxxxxx
Vorsitzende des Aufsichtsrates: Angelika Mozdzen
Sitz und Registergericht: Hamburg, HRB 90934
Vorstand: Jens-U. Mozdzen
USt-IdNr. DE 814 013 983
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com