Re: handling osd removal with ceph-volume?

On Fri, Oct 26, 2018 at 7:11 AM John Spray <jspray@xxxxxxxxxx> wrote:
>
> On Thu, Oct 25, 2018 at 11:08 PM Noah Watkins <nwatkins@xxxxxxxxxx> wrote:
> >
> > After speaking with Alfredo and the orchestrator team, it seems there
> > are some open questions (well, maybe just questions whose answers need
> > to be written down) about OSD removal with ceph-volume.
> >
> > Feel free to expand the scope of this thread to the many different
> > destruction / deactivation scenarios, but we have been driven
> > initially by the conversion of one ceph-ansible playbook that removes
> > a specific OSD from the cluster that boils down to:
> >
> >   1. ceph-disk deactivate --deactivate-by-id ID --mark-out
> >   2. ceph-disk destroy --destroy-by-id ID --zap
> >   3. < manually destroy partitions from `ceph-disk list` >
> >
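One concrete way to do step 3 is to zap each device that `ceph-disk
list` reports for the OSD being removed (sgdisk is just one option
here, any partition-table wiping tool would do):

    # destroy the GPT/MBR structures on each reported device
    sgdisk --zap-all /dev/sdX
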
> > To accomplish the equivalent without ceph-disk we are doing the following:
> >
> >   1. ceph osd out ID
> >   2. systemctl disable ceph-osd@ID
> >   3. systemctl stop ceph-osd@ID
> >   4. something equivalent to:
> >     | osd_devs = ceph-volume lvm list --format json
> >     | for dev in osd_devs[ID]:
> >     |    ceph-volume lvm zap dev["path"]
> >   5. ceph osd purge ID
> >
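For reference, a runnable version of step 4, assuming jq is available
for the JSON parsing (the id-keyed structure and the "path" key match
the pseudocode above and what `ceph-volume lvm list --format json`
reports):

    ID=3  # example OSD id
    ceph-volume lvm list --format json \
      | jq -r --arg id "$ID" '.[$id][].path' \
      | while read -r dev; do
          # wipe the logical volume / device backing this OSD
          ceph-volume lvm zap "$dev"
        done
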
> > This list seems to be complete after examining ceph docs and
> > ceph-volume itself. Is there anything missing? Similar questions here:
> > http://tracker.ceph.com/issues/22287
> >
> > Of these steps, the primary question that has popped up is how to
> > maintain, outside of ceph-volume, the inverse of the systemd unit
> > management that ceph-volume takes care of during OSD creation (e.g.
> > ceph-osd and ceph-volume units), and whether that inverse operation
> > should be a part of ceph-volume itself.
>
> My suggestion would be to have a separation of the three aspects of
> creating/destroying OSDs:
>  A) The drive/volume manipulation part (ceph-volume)
>  B) Enabling/disabling execution of the ceph-osd process (systemd,
> containers, something else...)
>  C) The updates to Ceph cluster maps (ceph osd purge, ceph osd destroy etc)
>
> The thing that ties all three together would live up at the ceph-mgr
> layer, where a high level UI (the dashboard and new CLI bits) would
> tie it all together.

This proposed separation is at odds with what ceph-volume does today:
all three happen when provisioning an OSD. Not having a counterpart
for deactivation would cause the same confusion we have today: why
does enabling happen in ceph-volume while disabling/deactivation does
not?

>
> That isn't to exclude having functionality in ceph-volume where it's a
> useful convenience (e.g. systemd), but in general ceph-volume can't be
> expected to know how to start OSD services in e.g. Kubernetes
> environments.

The same could be said of provisioning. How does ceph-volume know how
to provision an OSD in Kubernetes? It doesn't. What we do there is
enable certain functionality that containers can make use of, for
example doing all the activation but skipping the systemd enabling.
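
For example, this is roughly what a containerized deployment does
today: activate without touching systemd and let the container
entrypoint run the ceph-osd process itself (OSD_ID and OSD_FSID are
placeholders):

    # mount and prepare the OSD directory, but skip enabling systemd units
    ceph-volume lvm activate --no-systemd "$OSD_ID" "$OSD_FSID"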

There are a couple of reasons why 'deactivate' hasn't made it into
ceph-volume. One of them is that it wasn't clear (to me) whether
deactivation meant full removal/purging of the OSD, or leaving it in
a state where it wouldn't start (e.g. with the systemd units
disabled).

My guess is that there is a need for both, plus a few more use cases,
like disabling the systemd unit so that the same OSD can be
provisioned again. So far we've concentrated on creating OSDs and
surpassing ceph-disk's features, but I think we can start exploring
the complexity of deactivation now.
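
To make the second case concrete, a hypothetical 'deactivate' (not
implemented today, this is only a sketch of the manual equivalent)
would roughly amount to:

    # stop the OSD and keep it from starting at boot
    systemctl disable --now "ceph-osd@${ID}"
    # unmount the data directory so the volume can be reused
    umount "/var/lib/ceph/osd/ceph-${ID}"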


>
> John
>
> > My understanding of the systemd process for ceph is that the
> > ceph-volume unit itself activates the corresponding OSD using the
> > ceph-osd systemd template, so there aren't any osd-specific unit files
> > to clean up when an OSD is removed. That still leaves the question of
> > how to properly remove the ceph-volume units if that is indeed the
> > process that needs to occur. Glancing over the zap code, it doesn't
> > look like zap handles that task. Related tracker here:
> > http://tracker.ceph.com/issues/25029
> >
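For context on the above: the activation units are instances of the
ceph-volume@ systemd template, named with an lvm-<osd id>-<osd fsid>
suffix, so a full removal would also mean disabling that instance,
e.g.:

    # show the per-OSD activation unit instances
    systemctl list-units 'ceph-volume@*'
    # ID and OSD_FSID are the removed OSD's id and fsid
    systemctl disable "ceph-volume@lvm-${ID}-${OSD_FSID}"
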
> > The ceph docs seem to indicate only that the OSD needs to be
> > stopped; presumably there are other final clean-up steps?
> >
> >
> > - Noah


