Hi Igor,
The immediate answer is to use "ceph-volume lvm zap" on the db LV after
running the migrate. But for the longer term I think the "lvm zap" should
be included in the "lvm migrate" process.
I.e. this works to migrate a separate wal/db to the block device:
#
# WARNING! DO NOT ZAP AFTER STARTING THE OSD!!
#
$ cephadm ceph-volume lvm list "${osd}" > ~/"osd.${osd}.list"
$ systemctl stop "${osd_service}"
$ cephadm shell --fsid "${fsid}" --name "osd.${osd}" -- \
ceph-volume lvm migrate --osd-id "${osd}" --osd-fsid "${osd_fsid}" \
--from db wal --target "${vg_lv}"
$ cephadm shell --fsid "${fsid}" --name "osd.${osd}" -- \
ceph-volume lvm zap "${db_lv}"
$ systemctl start "${osd_service}"
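For reference, here's roughly how the variables above can be populated. The
unit name assumes a cephadm-managed osd, and the last three values are just
placeholders to be copied from the saved "lvm list" output:
# Sketch only - substitute your own values.
osd=25
fsid=$(ceph fsid)                               # cluster fsid
osd_service="ceph-${fsid}@osd.${osd}.service"   # cephadm systemd unit for the osd
osd_fsid="<ceph.osd_fsid tag on the block LV>"
vg_lv="<block VG>/<block LV>"                   # migrate target, in VG/LV form
db_lv="/dev/<db VG>/<db LV>"                    # the db LV to zap afterwards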
WARNING! If you don't do the zap before starting the osd, the osd will
come up with the db still on the LV. If you then stop the osd, zap the
LV and start the osd again, you'll be running on the db as it was when
the migrate copied it to the block device, which will be missing any
updates made in the meantime. I don't know what problems that might cause.
In this situation I've restored the LV tags (i.e. all tags on the db LV,
the db_device and db_uuid tags on the block LV) using the info from
~/osd.${osd}.list (otherwise the migrate fails!) and then gone through the
migrate process again.
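In case it's useful, restoring the tags is just "lvchange --addtag" with the
values copied verbatim from the saved list, roughly along these lines (here
${db_uuid} and ${block_lv} are just shorthand for the corresponding values
in ~/osd.${osd}.list):
# Sketch only - copy the real tag values from the saved "lvm list" output.
$ lvchange \
      --addtag "ceph.type=db" \
      --addtag "ceph.osd_id=${osd}" \
      --addtag "ceph.osd_fsid=${osd_fsid}" \
      "${db_lv}"    # ...plus the rest of the tags listed for the db LV
$ lvchange \
      --addtag "ceph.db_device=${db_lv}" \
      --addtag "ceph.db_uuid=${db_uuid}" \
      "${block_lv}"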
The underlying problem is that the osd is being activated as a "raw"
device rather than an "lvm" device, and the "raw" db device (which is
actually an LVM LV) still has a bluestore label on it after the migrate,
so it's still detected as a component of the osd.
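You can see the leftover label directly with ceph-bluestore-tool, e.g.
something like:
$ cephadm shell --fsid "${fsid}" --name "osd.${osd}" -- \
      ceph-bluestore-tool show-label --dev "${db_lv}"
After the zap this should no longer report a bluestore label.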
E.g. before the migrate, both of these show the osd with the separate db:
$ cephadm ceph-volume lvm list
$ cephadm ceph-volume raw list
After the migrate (without zap), the "lvm list" does NOT show the separate
db (because the appropriate LV tags have been removed), but the "raw list"
still shows the osd with the separate db.
And the osd is being activated as a "raw" device, both before and after
the migrate. E.g. extract from the journal before the migrate:
Nov 15 22:39:05 k12 bash[3829222]: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-25
Nov 15 22:39:05 k12 bash[3829222]: Running command: /usr/bin/ceph-bluestore-tool prime-osd-dir --path /var/lib/ceph/osd/ceph-25 --no-mon-config --dev /dev/mapper/ceph--5ccbb386--142b--4bf7--
Nov 15 22:39:05 k12 bash[3829222]: Running command: /usr/bin/chown -h ceph:ceph /dev/mapper/ceph--5ccbb386--142b--4bf7--a180--04bcf9a1f61b-osd--block--7710024b--ec71--4fd3--b94c--c4c4b9af2d2
Nov 15 22:39:05 k12 bash[3829222]: Running command: /usr/bin/chown -R ceph:ceph /dev/dm-1
Nov 15 22:39:05 k12 bash[3829222]: Running command: /usr/bin/ln -s /dev/mapper/ceph--5ccbb386--142b--4bf7--a180--04bcf9a1f61b-osd--block--7710024b--ec71--4fd3--b94c--c4c4b9af2d21 /var/lib/ce
Nov 15 22:39:05 k12 bash[3829222]: Running command: /usr/bin/chown -h ceph:ceph /dev/mapper/ceph--d4b1e932--4557--4b88--bed2--9305a07e76eb-osd--db--6a507f57--884c--4947--a147--cd50f98f1a23
Nov 15 22:39:05 k12 bash[3829222]: Running command: /usr/bin/chown -R ceph:ceph /dev/dm-2
Nov 15 22:39:05 k12 bash[3829222]: Running command: /usr/bin/ln -s /dev/mapper/ceph--d4b1e932--4557--4b88--bed2--9305a07e76eb-osd--db--6a507f57--884c--4947--a147--cd50f98f1a23 /var/lib/ceph/
Nov 15 22:39:05 k12 bash[3829222]: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-25
Nov 15 22:39:05 k12 bash[3829222]: --> ceph-volume raw activate successful for osd ID: 25
After a migrate without a zap - note there are still two mapper/lv devices
found, including the now-unwanted db LV:
Nov 16 09:08:31 k12 bash[4012506]: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-25
Nov 16 09:08:31 k12 bash[4012506]: Running command: /usr/bin/ceph-bluestore-tool prime-osd-dir --path /var/lib/ceph/osd/ceph-25 --no-mon-config --dev /dev/mapper/ceph--5ccbb386--142b--4bf7--
Nov 16 09:08:31 k12 bash[4012506]: Running command: /usr/bin/chown -h ceph:ceph /dev/mapper/ceph--5ccbb386--142b--4bf7--a180--04bcf9a1f61b-osd--block--7710024b--ec71--4fd3--b94c--c4c4b9af2d2
Nov 16 09:08:31 k12 bash[4012506]: Running command: /usr/bin/chown -R ceph:ceph /dev/dm-1
Nov 16 09:08:31 k12 bash[4012506]: Running command: /usr/bin/ln -s /dev/mapper/ceph--5ccbb386--142b--4bf7--a180--04bcf9a1f61b-osd--block--7710024b--ec71--4fd3--b94c--c4c4b9af2d21 /var/lib/ce
Nov 16 09:08:31 k12 bash[4012506]: Running command: /usr/bin/chown -h ceph:ceph /dev/mapper/ceph--d4b1e932--4557--4b88--bed2--9305a07e76eb-osd--db--6a507f57--884c--4947--a147--cd50f98f1a23
Nov 16 09:08:31 k12 bash[4012506]: Running command: /usr/bin/chown -R ceph:ceph /dev/dm-2
Nov 16 09:08:31 k12 bash[4012506]: Running command: /usr/bin/ln -s /dev/mapper/ceph--d4b1e932--4557--4b88--bed2--9305a07e76eb-osd--db--6a507f57--884c--4947--a147--cd50f98f1a23 /var/lib/ceph/
Nov 16 09:08:31 k12 bash[4012506]: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-25
Nov 16 09:08:31 k12 bash[4012506]: --> ceph-volume raw activate successful for osd ID: 25
After a migrate and zap - note there's now only a single mapper/lv device
found, i.e. we've successfully stopped using the separate db device:
Nov 16 12:33:39 k12 bash[4091471]: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-25
Nov 16 12:33:39 k12 bash[4091471]: Running command: /usr/bin/ceph-bluestore-tool prime-osd-dir --path /var/lib/ceph/osd/ceph-25 --no-mon-config --dev /dev/mapper/ceph--5ccbb386--142b--4bf7--
Nov 16 12:33:39 k12 bash[4091471]: Running command: /usr/bin/chown -h ceph:ceph /dev/mapper/ceph--5ccbb386--142b--4bf7--a180--04bcf9a1f61b-osd--block--7710024b--ec71--4fd3--b94c--c4c4b9af2d2
Nov 16 12:33:39 k12 bash[4091471]: Running command: /usr/bin/chown -R ceph:ceph /dev/dm-1
Nov 16 12:33:39 k12 bash[4091471]: Running command: /usr/bin/ln -s /dev/mapper/ceph--5ccbb386--142b--4bf7--a180--04bcf9a1f61b-osd--block--7710024b--ec71--4fd3--b94c--c4c4b9af2d21 /var/lib/ce
Nov 16 12:33:39 k12 bash[4091471]: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-25
Nov 16 12:33:39 k12 bash[4091471]: --> ceph-volume raw activate successful for osd ID: 25
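As a final sanity check (at least on my release; I believe the metadata key
is present for bluestore osds), the osd should now report that it no longer
has a dedicated db device once it has restarted:
$ ceph osd metadata "${osd}" | grep bluefs_dedicated_db
which I'd expect to show "0" after the migrate and zap.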
Wrapping up...
I think the "lvm zap" should be included in the "ceph-volume lvm migrate"
process, and perhaps "ceph-volume activate" should be changed to NOT detect
LVs as raw devices, so they're correctly activated as "lvm" devices.
Another oddity that unfortunately extended the time taken to analyse this
issue... why does "ceph-volume raw list ${osd}" NOT show lvm osds, when
plain "ceph-volume raw list" shows them?
Cheers,
Chris