Many thanks for your replies!

On 21.02.2018 at 02:20, Alfredo Deza wrote:
> On Tue, Feb 20, 2018 at 5:56 PM, Oliver Freyermuth
> <freyermuth@xxxxxxxxxxxxxxxxxx> wrote:
>> Dear Cephalopodians,
>>
>> with the release of ceph-deploy we are thinking about migrating our
>> Bluestore-OSDs (currently created with ceph-disk via old ceph-deploy)
>> to be created via ceph-volume (with LVM).
>
> When you say migrating, do you mean creating them again from scratch
> or making ceph-volume take over the previously created OSDs
> (ceph-volume can do both)

I would recreate from scratch to switch to LVM; we have a k=4 m=2 EC pool
with 6 hosts, so I can just take down a full host and recreate.
But it is good to know both would work!

>>
>> I note two major changes:
>> 1. It seems the block.db partitions have to be created beforehand, manually.
>> With ceph-disk, one should not do that - or manually set the correct PARTTYPE ID.
>> Will ceph-volume take care of setting the PARTTYPE on existing partitions for block.db now?
>> Is it not necessary anymore?
>> Is the config option bluestore_block_db_size now also obsolete?
>
> Right, ceph-volume will not create any partitions for you, so no, it
> will not take care of setting PARTTYPE either. If your setup requires
> a block.db, then this must be created beforehand and then passed on to
> ceph-volume. The one requirement, if it is a partition, is to have a
> PARTUUID. For logical volumes it can just work as-is. This is
> explained in detail at
> http://docs.ceph.com/docs/master/ceph-volume/lvm/prepare/#bluestore
>
> PARTUUID information for ceph-volume at:
> http://docs.ceph.com/docs/master/ceph-volume/lvm/prepare/#partitioning

OK. So do I understand correctly that the PARTTYPE setting (i.e. those
magic numbers found e.g. in the ceph-disk sources in PTYPE:
https://github.com/ceph/ceph/blob/master/src/ceph-disk/ceph_disk/main.py#L62 )
is no longer needed for the block.db partitions, since it was effectively
only there to make udev work?
I remember from ceph-disk that if I created the block.db partition
beforehand without setting the magic PARTTYPE, it would become unhappy.
If I understand this correctly, ceph-volume and the systemd activation
path should not care about PARTTYPE at all.

So in short, the steps to create a new OSD would be:
- Create the block.db partition (without caring about PARTTYPE); I only
  have to make sure it has a PARTUUID.
- ceph-volume lvm create --bluestore --block.db /dev/sdag1 --data /dev/sda
  (or the same via ceph-deploy)

>>
>> 2. Activation does not work via udev anymore, which solves some racy things.
>>
>> This second major change makes me curious: How does activation work now?
>> In the past, I could reinstall the full OS, install ceph packages,
>> trigger udev / reboot and all OSDs would come back,
>> without storing any state / activating any services in the OS.
>
> Activation works via systemd. This is explained in detail here
> http://docs.ceph.com/docs/master/ceph-volume/lvm/activate
>
> Nothing with `ceph-volume lvm` requires udev for discovery. If you
> need to re-install the OS and recover your OSDs, all you need to do is
> re-activate them. You would need to know the ID and UUID of the OSDs.
>
> If you don't have that information handy, you can run:
>
> ceph-volume lvm list
>
> And all the information will be available. This will persist even on
> system re-installs

Understood - so indeed the manual step would be to run "list" and then
activate the OSDs one by one to re-create the service files.
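Just to write the two command sequences down for myself (device names,
sizes and the id/fsid values are placeholders; the sgdisk call is only
meant as one way to create a GPT partition, which automatically gets a
PARTUUID):

    # New OSD: create the block.db partition first, then hand both
    # devices to ceph-volume (sizes and devices are just examples).
    sgdisk --new=1:0:+30G /dev/sdag
    ceph-volume lvm create --bluestore --block.db /dev/sdag1 --data /dev/sda

    # After an OS reinstall: look up id and fsid, then re-activate each OSD.
    ceph-volume lvm list
    ceph-volume lvm activate <osd-id> <osd-fsid>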
More cumbersome than letting udev do its thing, but it certainly gives
more control, so it seems preferable.
Are there plans for something like a "ceph-volume discover-and-activate"
command, which would effectively run "ceph-volume lvm list" and activate
all OSDs that are re-discovered from the LVM metadata?
This would simplify OS reinstalls a lot (otherwise I'll likely write a
small shell script to do exactly that - see the rough sketch at the end
of this mail), and as far as I understand, activating an already
activated OSD should be harmless (it should only re-enable an already
enabled service file).

>>
>> Does this still work?
>> Or is there a manual step needed to restore the ceph-osd@ID-UUID services,
>> which at first glance appear to store state (namely ID and UUID)?
>
> The manual step would be to call activate as described here
> http://docs.ceph.com/docs/master/ceph-volume/lvm/activate/#new-osds
>
>> If that's the case:
>> - What is this magic manual step?
>
> Linked above
>
>> - Is it still possible to flip two disks within the same OSD host without issues?
>
> What do you mean by "flip" ?

Sorry, I was unclear on this. I meant exchanging two hard drives with each
other within a single OSD host, e.g. /dev/sda => /dev/sdc and
/dev/sdc => /dev/sda (because of controller weirdness or whatever reason).
If I understand correctly, this should not be a problem at all, since the
OSD ID and PARTUUID are unaffected by that (as you write, the LVM metadata
will persist with the device).

Many thanks again for this very extensive reply!

>> I would guess so, since the services would detect the disk in the
>> ceph-volume trigger phase.
>> - Is it still possible to take a disk from one OSD host and put it in
>> another one, or does this now need manual interaction?
>> With ceph-disk / udev, it did not, since udev triggered disk activation
>> and then the service was created at runtime.
>
> It is technically possible, the lvm part of it was built with this in
> mind. The LVM metadata will persist with the device, so this is not a
> problem. Just manual activation would be needed.
>
>> Many thanks for your help and cheers,
>> Oliver
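P.S.: The small re-activation script I have in mind would be roughly the
following (untested sketch; it assumes the human-readable output of
"ceph-volume lvm list" keeps printing the "osd id" and "osd fsid" lines):

    #!/bin/bash
    # Sketch: re-activate all OSDs that ceph-volume can discover from LVM metadata.
    # Collect (osd id, osd fsid) pairs from "ceph-volume lvm list"; sort -u drops
    # duplicates (e.g. when block and block.db are listed for the same OSD).
    ceph-volume lvm list | awk '
        /osd id/   { id = $NF }
        /osd fsid/ { fsid = $NF }
        id != "" && fsid != "" { print id, fsid; id = ""; fsid = "" }
    ' | sort -u | while read -r id fsid; do
        # Re-activating an already activated OSD should be harmless.
        ceph-volume lvm activate "${id}" "${fsid}"
    done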