Hi,
On 2/1/19 11:40 AM, Stuart Longland wrote:
> Hi all,
>
> I'm just in the process of migrating my 3-node Ceph cluster from
> BTRFS-backed FileStore over to BlueStore.
>
> Last weekend I did this with my first node, and while the migration went
> fine, I noted that the OSD did not survive a reboot test: after
> rebooting, /var/lib/ceph/osd/ceph-0 was completely empty and
> /etc/init.d/ceph-osd.0 (I run OpenRC init on Gentoo) would refuse to start.
>
> https://stuartl.longlandclan.id.au/blog/2019/01/28/solar-cluster-adventures-in-ceph-migration/
>
> I managed to recover it, but tonight I'm trying with my second node.
> I've provisioned a temporary OSD (plugged in via USB3) for it to migrate
> to using BlueStore. The Ceph cluster called it osd.4.
>
> One thing I note is that `ceph-volume` seems to have created a `tmpfs`
> mount for the new OSD:
>
> tmpfs on /var/lib/ceph/osd/ceph-4 type tmpfs (rw,relatime)
>
> Admittedly this is just a temporary OSD; tomorrow I'll be blowing away
> the *real* OSD on this node (osd.1) and provisioning it again using
> BlueStore.
>
> I really don't want a repeat of the "oh crap" moment I had on Monday
> afternoon (as one does on the Australia Day long weekend), frantically
> digging through man pages and having to do the
> `ceph-bluestore-tool prime-osd-dir` dance.
>
> I think mounting tmpfs for something that should be persistent is highly
> dangerous. Is there some flag I should be using when creating the
> BlueStore OSD to avoid that issue?
The tmpfs setup is expected. All persistent data for BlueStore OSDs set
up with LVM is stored in LVM metadata (as tags on the logical volumes).
The LVM/udev handler for BlueStore volumes creates these tmpfs
filesystems on the fly at activation time and populates them with the
information from that metadata.
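If you want to see what ceph-volume reads when it rebuilds one of those
directories, the metadata is visible with standard tools. A quick sketch
(commands only, output omitted; run on any node with LVM-based OSDs):

ceph-volume lvm list            # per-OSD view: osd id, osd fsid, devices
lvs -o lv_name,lv_tags          # raw ceph.* tags stored on each logical volume

The activation step (ceph-volume lvm activate) recreates the files under
/var/lib/ceph/osd/ceph-* from those tags on every boot; as far as I know
it is the same ceph-bluestore-tool prime-osd-dir step you ended up doing
by hand. On a box without systemd you should be able to run it manually
with `ceph-volume lvm activate --all --no-systemd` (this is from memory,
check the --help output of your ceph-volume version).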
Our Ceph nodes no longer have any persistent data in /var/lib/ceph/osd
at all:
root@bcf-01:~# mount
...
/dev/sdm1 on /boot type ext4 (rw,relatime,data=ordered)
tmpfs on /var/lib/ceph/osd/ceph-125 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-128 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-130 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-3 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-1 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-2 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-129 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-5 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-127 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-131 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-6 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-4 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-126 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-124 type tmpfs (rw,relatime)
....
This works fine on machines using systemd. If your setup does not
support this, you might want to use the 'simple' ceph-volume mode
instead of the 'lvm' one. AFAIK it uses the GPT partition type method
that has been around for years.
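Note that 'simple' does not create OSDs itself; it scans an OSD that
already exists and stores what it needs as JSON under /etc/ceph/osd/,
which is then used for activation instead of LVM tags and tmpfs.
Roughly (untested here, the paths are just illustrative; check the
ceph-volume documentation for your release):

ceph-volume simple scan /var/lib/ceph/osd/ceph-1   # scan a mounted OSD dir, writes /etc/ceph/osd/<id>-<fsid>.json
ceph-volume simple activate --all                  # activate every scanned OSD

The data directory then stays on a real, persistently mounted partition
rather than on tmpfs.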
Regards,
Burkhard