Octopus: conversion from ceph-ansible to Cephadm causes unexpected 15.2.15 → 15.2.13 downgrade for MDSs and RGWs

Hello everyone,

My colleagues and I just ran into an interesting situation while updating our Ceph training course. That course's labs cover deploying a Nautilus cluster with ceph-ansible, upgrading it to Octopus (also with ceph-ansible), and then converting it to Cephadm before proceeding with the upgrade to Pacific.

When freshly upgraded to Octopus with ceph-ansible, the entire cluster is at version 15.2.15, and everything that is then adopted into Cephadm management (with "cephadm adopt --style legacy") gets containers running that release. So far, so good.
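For reference, the adoption itself is just the per-daemon invocation from the adoption docs, run on each node in turn (the daemon names below are of course placeholders for our lab hosts):

# cephadm adopt --style legacy --name mon.$(hostname)
# cephadm adopt --style legacy --name mgr.$(hostname)
# cephadm adopt --style legacy --name osd.0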

Once we've completed the adoption process for MGRs, MONs, and OSDs, we proceed to redeploy our MDSs and RGWs using "ceph orch apply mds" and "ceph orch apply rgw". What we end up with, however, is a bunch of MDSs and RGWs running 15.2.13. Since the cluster previously ran Ansible-deployed 15.2.15 MDSs and RGWs, this amounts to a partial (and very unexpected) downgrade.
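Concretely, the redeploy commands we run look roughly like this (the filesystem name, realm/zone, and placement count reflect our lab setup, so adjust as needed):

# ceph orch apply mds cephfs --placement=3
# ceph orch apply rgw default default --placement=3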

The docs at https://docs.ceph.com/en/octopus/cephadm/adoption/ do state that we can use "cephadm --image <image>" to set the image. But we don't actually need that when we invoke cephadm directly ("cephadm adopt" does pull the correct image). Rather, we'd need to set the correct image for the daemons deployed by "ceph orch apply", and there doesn't seem to be a straightforward way to do that.
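For completeness, the documented invocation for that case would be something like the following (the image tag is merely an example):

# cephadm --image quay.io/ceph/ceph:v15.2.15 adopt --style legacy --name mon.$(hostname)

But again, the adopted daemons already come up on 15.2.15 without this; what we're missing is an equivalent override for the daemons that "ceph orch apply" creates.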

I suppose that this can be worked around in a couple of ways:

* by following the documentation and then running "ceph orch upgrade start --ceph-version 15.2.15" immediately after;

* by running "ceph orch daemon redeploy", which does support an --image parameter (but is per-daemon, thus less convenient than running through a rolling update).
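In command form, those two workarounds would look roughly like this (the daemon name is a placeholder; please double-check the redeploy syntax with "ceph orch daemon redeploy -h" on your version):

# ceph orch upgrade start --ceph-version 15.2.15
# ceph orch daemon redeploy <daemon.name> --image quay.io/ceph/ceph:v15.2.15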

But I'd argue that none of those additional steps should actually be necessary — rather, "ceph orch apply" should just deploy the correct (latest) version without additional user involvement.

The documentation seems to suggest another approach, namely to use an updated service spec, but unfortunately that won't work as we can't set "image" that way. Example for the rgw service:

---
# rgw.yaml
service_type: rgw
service_id: default.default
placement:
  count: 3
image: "quay.io/ceph/ceph:v15"
ports:
  - 7480

# ceph orch apply -i rgw.yaml
Error EINVAL: ServiceSpec: __init__() got an unexpected keyword argument 'image'

So we're curious: what is the correct way to ensure that "ceph orch apply" deploys the latest Octopus release for MDSs and RGWs being redeployed as part of a Cephadm cluster conversion? Or is this simply a bug somewhere in the orchestrator that needs fixing?

Cheers,
Florian

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



