Re: [RFE] ceph-volume prepare and activate enhancements for containers

On Fri, Dec 6, 2019 at 5:59 AM Sebastien Han <shan@xxxxxxxxxx> wrote:
>
> Hi,
>
> Following up on my previous ceph-volume email as promised.
>
> When running Ceph with Rook in Kubernetes in the cloud (AWS, Azure,
> Google, whatever), the OSDs are backed by PVCs (cloud block storage)
> attached to virtual machines.
> This makes the storage portable: if the VM dies, the device will be
> attached to a new virtual machine and the OSD will resume running.
>
> In Rook, we have 2 main deployments for the OSD:
>
> 1. Prepare the disk to become an OSD
> Prepare will run on the VM, attach the block device, run "ceph-volume
> prepare", then this gets complicated. After this, the device is
> supposed to be detached from the VM because the container terminated.
> However, the block is still held by LVM so the VG must be
> de-activated. Currently, we do this in Rook, but it would be nice to
> de-activate the VG once ceph-volume is done preparing the disk in a
> container.
>
> 2. Activate the OSD.
> Now, onto the new container, the device is attached again on the VM.
> At this point, more changes will be required in ceph-volume,
> particularly in the "activate" call.
>   a. ceph-volume should activate the VG

By VG you mean LVM's Volume Group?

>   b. ceph-volume should activate the device normally

Not "normally" though, right? That would imply starting the OSD, which
you are indicating is not desired.

>   c. ceph-volume should run the ceph-osd process in the foreground as
> well as accept flags for that CLI; we could have something like:
> "ceph-volume lvm activate --no-systemd $STORE_FLAG $OSD_ID $OSD_UUID
> <a bunch of flags>"
>   Perhaps we need a new flag to indicate we want to run the osd
> process in foreground?
>   Here is an example of how an OSD runs today:
>
>   ceph-osd --foreground --id 2 --fsid
> 9a531951-50f2-4d48-b012-0aef0febc301 --setuser ceph --setgroup ceph
> --crush-location=root=default host=minikube --default-log-to-file
> false --ms-learn-addr-from-peer=false
>
>   --> we can have a bunch of flags or an ENV var with all the flags
> whatever you prefer.
>
>   This wrapper should watch for signals too; it should respond to
> SIGTERM in the following way:
>     - stop the OSD
>     - de-activate the VG
>     - exit 0
>
> Just a side note, the VG must be de-activated when the container stops
> so that the block device can be detached from the VMs, otherwise,
> it'll still be held by LVM.

I am worried that this goes beyond what I consider the scope of
ceph-volume which is: prepare device(s) to be part of an OSD.

Catching signals, handling the OSD in the foreground, and accepting
(proxying) flags all sound problematic for a robust implementation in
ceph-volume, even if doing so would help Rook in this case.

The other challenge I see is that Ceph seems to be in a transition
from being a bare-metal project to a container one, while lots of
tooling (like ceph-volume) is deeply tied to the non-containerized
workflows. This makes it difficult (and non-obvious!) to add more
flags to ceph-volume for things that help containerized deployments.

To solve the issues you describe, I think you need either a separate
command-line tool that can invoke ceph-volume with the added features
you listed, or, if there is significant push to get more things into
ceph-volume, a separate sub-command, so that the `lvm` sub-command is
isolated from the conflicting logic.

My preference would be a wrapper script, separate from the Ceph project.
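
To make the idea concrete, here is a minimal sketch in Python of what
such an external wrapper might look like. It is an illustration only,
not an existing tool. The function names, the VG name argument, and
the default `--bluestore` store flag are all assumptions. It activates
the VG, calls "ceph-volume lvm activate --no-systemd", runs ceph-osd
in the foreground, and on SIGTERM stops the OSD, de-activates the VG,
and returns 0.

```python
"""Hypothetical wrapper around ceph-volume and ceph-osd for containers.

A sketch only: not part of Ceph or Rook, and not tested on a real cluster.
"""
import signal
import subprocess


def vgchange_cmd(vg_name, active):
    # `vgchange -ay` activates a volume group, `vgchange -an` de-activates it.
    return ["vgchange", "-ay" if active else "-an", vg_name]


def activate_cmd(osd_id, osd_fsid, store_flag="--bluestore"):
    # --no-systemd keeps ceph-volume from enabling or starting systemd units.
    # The default store flag is an assumption about the object store in use.
    return ["ceph-volume", "lvm", "activate", "--no-systemd", store_flag,
            str(osd_id), osd_fsid]


def osd_cmd(osd_id, osd_fsid, extra_flags=()):
    # Mirrors the ceph-osd invocation quoted in this thread; any extra flags
    # are passed through untouched.
    return ["ceph-osd", "--foreground", "--id", str(osd_id),
            "--fsid", osd_fsid, "--setuser", "ceph", "--setgroup", "ceph",
            *extra_flags]


def run(vg_name, osd_id, osd_fsid, extra_flags=(),
        popen=subprocess.Popen, check_call=subprocess.check_call):
    """Run one OSD in the foreground and release the VG afterwards.

    `popen` and `check_call` are injectable so the flow can be exercised
    without LVM or Ceph installed.
    """
    state = {"osd": None, "terminated": False}

    def on_sigterm(signum, frame):
        # Remember that we were asked to stop, then stop the OSD if running.
        state["terminated"] = True
        if state["osd"] is not None:
            state["osd"].terminate()

    signal.signal(signal.SIGTERM, on_sigterm)
    check_call(vgchange_cmd(vg_name, active=True))
    try:
        check_call(activate_cmd(osd_id, osd_fsid))
        if not state["terminated"]:
            state["osd"] = popen(osd_cmd(osd_id, osd_fsid, extra_flags))
            state["osd"].wait()
    finally:
        # Always de-activate the VG so the block device can be detached.
        check_call(vgchange_cmd(vg_name, active=False))
    # Exit 0 after a requested stop; otherwise report the OSD's exit code.
    return 0 if state["terminated"] else state["osd"].returncode
```

A container entrypoint could call `run()` with the VG name, OSD id and
OSD fsid, then exit with its return value. Keeping the signal handling
and flag proxying in a wrapper like this leaves ceph-volume itself
unchanged.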

>
> Hopefully, I was clear :).
> This is just a proposal if you feel like this could be done
> differently, feel free to suggest.
>
> Thanks!
> –––––––––
> Sébastien Han
> Senior Principal Software Engineer, Storage Architect
>
> "Always give 100%. Unless you're giving blood."
> _______________________________________________
> Dev mailing list -- dev@xxxxxxx
> To unsubscribe send an email to dev-leave@xxxxxxx