On Fri, Dec 6, 2019 at 5:59 AM Sebastien Han <shan@xxxxxxxxxx> wrote:
>
> Hi,
>
> Following up on my previous ceph-volume email as promised.
>
> When running Ceph with Rook in Kubernetes in the Cloud (AWS, Azure,
> Google, whatever), the OSDs are backed by PVC (Cloud block storage)
> attached to virtual machines.
> This makes the storage portable: if the VM dies, the device will be
> attached to a new virtual machine and the OSD will resume running.
>
> In Rook, we have 2 main deployments for the OSD:
>
> 1. Prepare the disk to become an OSD
> Prepare will run on the VM, attach the block device, run "ceph-volume
> prepare", then this gets complicated. After this, the device is
> supposed to be detached from the VM because the container terminated.
> However, the block device is still held by LVM so the VG must be
> de-activated. Currently, we do this in Rook, but it would be nice to
> de-activate the VG once ceph-volume is done preparing the disk in a
> container.
>
> 2. Activate the OSD.
> Now, onto the new container, the device is attached again on the VM.
> At this point, more changes will be required in ceph-volume,
> particularly in the "activate" call.
> a. ceph-volume should activate the VG

By VG you mean LVM's Volume Group?

> b. ceph-volume should activate the device normally

Not "normally" though, right? That would imply starting the OSD, which
you are indicating is not desired.

> c. ceph-volume should run the ceph-osd process in foreground as well
> as accepting flags for that CLI, we could have something like:
> "ceph-volume lvm activate --no-systemd $STORE_FLAG $OSD_ID $OSD_UUID
> <a bunch of flags>"
> Perhaps we need a new flag to indicate we want to run the osd
> process in foreground?
> Here is an example of how an OSD runs today:
>
> ceph-osd --foreground --id 2 --fsid
> 9a531951-50f2-4d48-b012-0aef0febc301 --setuser ceph --setgroup ceph
> --crush-location=root=default host=minikube --default-log-to-file
> false --ms-learn-addr-from-peer=false
>
> --> we can have a bunch of flags or an ENV var with all the flags,
> whatever you prefer.
>
> This wrapper should watch for signals too, it should reply to
> SIGTERM in the following way:
> - stop the OSD
> - de-activate the VG
> - exit 0
>
> Just a side note, the VG must be de-activated when the container stops
> so that the block device can be detached from the VMs, otherwise,
> it'll still be held by LVM.

I am worried that this goes beyond what I consider the scope of
ceph-volume, which is: prepare device(s) to be part of an OSD.

Catching signals, handling the OSD in the foreground, and accepting
(proxying) flags sounds problematic for a robust implementation in
ceph-volume, even if it would help Rook in this case.

The other challenge I see is that Ceph seems to be in a transition
from being a bare-metal project to a container one, except lots of
tooling (like ceph-volume) is deeply tied to the non-containerized
workflows. This makes it difficult (and non-obvious!) in ceph-volume
when adding more flags to do things that help the containerized
deployment.

To solve the issues you describe, I think you need either a separate
command-line tool that can invoke ceph-volume with the added features
you listed, or, if there is significant push to get more things into
ceph-volume, a separate sub-command, so that `lvm` is isolated from
the conflicting logic.

My preference would be a wrapper script, separate from the Ceph
project.

> Hopefully, I was clear :).
> This is just a proposal; if you feel like this could be done
> differently, feel free to suggest.
>
> Thanks!
> –––––––––
> Sébastien Han
> Senior Principal Software Engineer, Storage Architect
>
> "Always give 100%.
> Unless you're giving blood."
> _______________________________________________
> Dev mailing list -- dev@xxxxxxx
> To unsubscribe send an email to dev-leave@xxxxxxx
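
For illustration only, the standalone wrapper discussed in this thread
(run the OSD in the foreground; on SIGTERM stop the OSD, de-activate the
VG, exit 0) could be sketched roughly as below. This is a minimal sketch
under assumptions, not part of ceph-volume or Rook: the function name
`run_osd_wrapper` is hypothetical, both commands are passed in as
argument lists, and using `vgchange -an` for the de-activation step is
an assumption about how the VG would be released.

```python
import signal
import subprocess


def run_osd_wrapper(osd_cmd, deactivate_cmd):
    """Run osd_cmd in the foreground, then always run deactivate_cmd.

    Hypothetical helper: returns 0 if we were stopped by SIGTERM,
    otherwise the exit code of osd_cmd.
    """
    proc = None
    got_sigterm = False

    def on_sigterm(signum, frame):
        # Step 1 of the requested behaviour: stop the OSD.
        nonlocal got_sigterm
        got_sigterm = True
        if proc is not None:
            proc.terminate()

    # Install the handler before starting the child so that an early
    # SIGTERM is not lost.
    previous = signal.signal(signal.SIGTERM, on_sigterm)
    try:
        proc = subprocess.Popen(osd_cmd)
        if got_sigterm:
            # SIGTERM arrived while the child was being started.
            proc.terminate()
        rc = proc.wait()
    finally:
        signal.signal(signal.SIGTERM, previous)
        # Step 2: de-activate the VG so the block device can be detached.
        # Assumed to be something like ["vgchange", "-an", vg_name].
        subprocess.run(deactivate_cmd, check=False)

    # Step 3: exit 0 after a clean SIGTERM shutdown.
    return 0 if got_sigterm else rc
```

A container entrypoint would then pass the ceph-osd command line shown
earlier (for example `["ceph-osd", "--foreground", "--id", "2", ...]`)
as `osd_cmd`, the VG de-activation command as `deactivate_cmd`, and
exit with the returned value. The de-activation step runs whether the
OSD was stopped by a signal or exited on its own, since the device can
only be detached once LVM no longer holds it.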