Re: handling osd removal with ceph-volume?

On Fri, Oct 26, 2018 at 11:00 AM Jan Fajerski <jfajerski@xxxxxxxx> wrote:
>
> On Fri, Oct 26, 2018 at 08:06:34AM -0400, Alfredo Deza wrote:
> >On Fri, Oct 26, 2018 at 7:11 AM John Spray <jspray@xxxxxxxxxx> wrote:
> >>
> >> On Thu, Oct 25, 2018 at 11:08 PM Noah Watkins <nwatkins@xxxxxxxxxx> wrote:
> >> >
> >> > After speaking with Alfredo and the orchestrator team, it seems there
> >> > are some open questions (well, maybe just questions whose answers need
> >> > to be written down) about OSD removal with ceph-volume.
> >> >
> >> > Feel free to expand the scope of this thread to the many different
> >> > destruction / deactivation scenarios, but we were initially driven
> >> > by the conversion of one ceph-ansible playbook that removes a
> >> > specific OSD from the cluster, which boils down to:
> >> >
> >> >   1. ceph-disk deactivate --deactivate-by-id ID --mark-out
> >> >   2. ceph-disk destroy --destroy-by-id ID --zap
> >> >   3. < manually destroy partitions from `ceph-disk list` >
> >> >
> >> > To accomplish the equivalent without ceph-disk we are doing the following:
> >> >
> >> >   1. ceph osd out ID
> >> >   2. systemctl disable ceph-osd@ID
> >> >   3. systemctl stop ceph-osd@ID
> >> >   4. something equivalent to (see the runnable sketch below):
> >> >     | osd_devs = ceph-volume lvm list --format json
> >> >     | for dev in osd_devs[ID]:
> >> >     |    ceph-volume lvm zap dev["path"]
> >> >   5. ceph osd purge ID
> >> >
> >> > This list seems to be complete after examining ceph docs and
> >> > ceph-volume itself. Is there anything missing? Similar questions here:
> >> > http://tracker.ceph.com/issues/22287
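> >> >
> >> > For reference, a rough shell sketch of what step 4 might look like,
> >> > assuming jq is available and that the JSON from ceph-volume lvm list
> >> > is keyed by OSD id with a "path" field per device (worth
> >> > double-checking against your ceph-volume version):
> >> >
> >> >   ID=3  # hypothetical OSD id
> >> >   # collect every device path backing this OSD
> >> >   paths=$(ceph-volume lvm list --format json | \
> >> >           jq -r --arg id "$ID" '.[$id][].path')
> >> >   # zap each backing device
> >> >   for dev in $paths; do
> >> >       ceph-volume lvm zap "$dev"
> >> >   done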
> >> >
> >> > Of these steps, the primary question that has popped up is how to
> >> > handle, outside of ceph-volume, the inverse of the systemd unit
> >> > management that ceph-volume takes care of during OSD creation (e.g.
> >> > the ceph-osd and ceph-volume units), and whether that inverse
> >> > operation should be part of ceph-volume itself.
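> >> >
> >> > To make that concrete, my understanding (please correct me if this
> >> > is off) is that activation enables one templated ceph-volume unit
> >> > per OSD plus the ceph-osd unit, so the inverse would be roughly the
> >> > following, where the id and osd fsid are hypothetical placeholders
> >> > taken from ceph-volume lvm list:
> >> >
> >> >   ID=3                  # hypothetical OSD id
> >> >   FSID=aaaa-bbbb-cccc   # placeholder for the "osd fsid"
> >> >   systemctl disable ceph-osd@${ID}
> >> >   systemctl disable ceph-volume@lvm-${ID}-${FSID}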
> >>
> >> My suggestion would be to have a separation of the three aspects of
> >> creating/destroying OSDs:
> >>  A) The drive/volume manipulation part (ceph-volume)
> >>  B) Enabling/disabling execution of the ceph-osd process (systemd,
> >> containers, something else...)
> >>  C) The updates to Ceph cluster maps (ceph osd purge, ceph osd destroy etc)
> >>
> >> The thing that ties all three together would live up at the ceph-mgr
> >> layer, where a high level UI (the dashboard and new CLI bits) would
> >> tie it all together.
> >
> >This proposed separation is at odds with what ceph-volume does today.
> >All three happen when provisioning an OSD. Not having a counterpart
> >for deactivation would cause the same confusion we see today: why does
> >enabling happen in ceph-volume while disabling/deactivation does not?
> >
> >>
> >> That isn't to exclude having functionality in ceph-volume where it's a
> >> useful convenience (e.g. systemd), but in general ceph-volume can't be
> >> expected to know how to start OSD services in e.g. Kubernetes
> >> environments.
> >
> >The same could be said about provisioning. How does ceph-volume know
> >how to provision an OSD in Kubernetes? It doesn't. What we do there
> >is enable certain functionality that containers can make use of, for
> >example doing all the activation but skipping the systemd enabling.
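> >
> >As a concrete (hypothetical) example, activating an existing OSD
> >inside a container without touching systemd looks roughly like this,
> >using the --no-systemd flag; treat it as a sketch, not the canonical
> >container flow:
> >
> >  # id and osd fsid are placeholders from ceph-volume lvm list
> >  ceph-volume lvm activate --no-systemd 3 aaaa-bbbb-cccc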
> >
> >There are a couple of reasons why 'deactivate' hasn't made it into
> >ceph-volume. One of them is that it wasn't clear (to me) whether
> >deactivation meant full removal/purging of the OSD or leaving it in
> >a state where it wouldn't start (e.g. disabling the systemd units).
> >
> >My guess is that there is a need for both, and for a few more use
> >cases, like disabling the systemd unit so that the same OSD can be
> >provisioned again. So far we've concentrated on making OSD creation
> >surpass ceph-disk's features, but I think we can start exploring the
> >complexity of deactivation now.
> Yeah, that would be great. I was wondering about lvm management that
> might relate to this. AFAIU (and please correct me if I'm wrong) c-v
> does some basic lvm management when a block device is passed as --data,
> but to use an lv as a wal/db device it must be created beforehand.
> Would it make sense to add a dedicated lvm management layer to c-v, or
> was this ruled out long ago?
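>
> To illustrate the current situation, today that means something like
> the following (all names made up), pre-creating the db lv by hand and
> then handing it to c-v:
>
>   # pre-create a vg/lv for the db device
>   vgcreate ceph-db-vg /dev/nvme0n1
>   lvcreate -L 30G -n db-lv0 ceph-db-vg
>   # then pass it alongside the data device
>   ceph-volume lvm create --bluestore --data /dev/sdb \
>       --block.db ceph-db-vg/db-lv0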

We have! It is now part of the `ceph-volume lvm batch` sub-command,
which will create everything for you given an input of devices.

http://docs.ceph.com/docs/master/ceph-volume/lvm/batch/
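
As a rough example (devices made up), something like this should carve
data LVs out of the spinners and place block.db on the faster device;
--report previews the layout before anything is created:

  ceph-volume lvm batch --report /dev/sdb /dev/sdc /dev/nvme0n1
  ceph-volume lvm batch /dev/sdb /dev/sdc /dev/nvme0n1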

> I think this could also have benefits for other operations on lv's,
> like renaming and growing an lv (I believe Igor was looking into
> growing a wal/db lv and then growing BlueFS after that).
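>
> If I understand that work correctly, the eventual flow would be
> roughly this (a sketch, assuming the OSD is stopped and that
> ceph-bluestore-tool supports bluefs-bdev-expand; names are made up):
>
>   # grow the db/wal lv, then let BlueFS pick up the extra space
>   lvextend -L +20G ceph-db-vg/db-lv0
>   ceph-bluestore-tool bluefs-bdev-expand \
>       --path /var/lib/ceph/osd/ceph-3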
>
> Best,
> Jan
> >
> >
> >>
> >> John
> >>
> >> > My understanding of the systemd process for ceph is that the
> >> > ceph-volume unit itself activates the corresponding OSD using the
> >> > ceph-osd systemd template -- so there aren't any OSD-specific unit
> >> > files to clean up when an OSD is removed. That still leaves the
> >> > question of how to properly remove the ceph-volume units, if that is
> >> > indeed the process that needs to happen. Glancing over the zap code,
> >> > it doesn't look like zap handles that task. Related tracker here:
> >> > http://tracker.ceph.com/issues/25029
> >> >
> >> > The ceph docs only seem to indicate that the OSD needs to be
> >> > stopped; presumably there are other final clean-up steps?
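> >> >
> >> > My best guess at the remaining step would be something like this
> >> > (a sketch; the mount point may differ by setup):
> >> >
> >> >   ID=3   # hypothetical
> >> >   # ceph-volume mounts the OSD dir as a tmpfs during activation,
> >> >   # and it stays mounted after the OSD is stopped
> >> >   umount /var/lib/ceph/osd/ceph-${ID}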
> >> >
> >> >
> >> > - Noah
> >
>
> --
> Jan Fajerski
> Engineer Enterprise Storage
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nürnberg)


