Re: Upgrade from Octopus to Pacific cannot get monitor to join

Just a quick write up in case anyone else is stuck.

Following suggestions, I made three LXC containers and installed ceph-mon (Octopus) on each, then joined them to the existing cluster. Once they had joined, I changed the repos on one container and upgraded its ceph-mon to Pacific. It took a bit of trial and error, but eventually I had three mons running Pacific, and then I was able to use ceph orch to remove and re-add mons running Pacific in containers. A rough sketch of the commands is below.
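
Roughly what that looked like on each container (the mon ID "mon-a" and the repo file path are placeholders, and this is from memory, so check it against the docs before running anything):

  # Octopus first: install ceph-mon from the octopus repo and join the new mon manually
  apt install ceph-mon
  ceph mon getmap -o /tmp/monmap
  ceph auth get mon. -o /tmp/keyring
  ceph-mon --mkfs -i mon-a --monmap /tmp/monmap --keyring /tmp/keyring
  chown -R ceph:ceph /var/lib/ceph/mon/ceph-mon-a
  systemctl start ceph-mon@mon-a

  # Once it is in quorum, switch the apt repo from octopus to pacific and upgrade in place
  sed -i 's/octopus/pacific/' /etc/apt/sources.list.d/ceph.list
  apt update && apt install ceph-mon
  systemctl restart ceph-mon@mon-a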

It appears ceph orch redeploy does not work on this cluster for some reason: it says the redeploy is scheduled, but nothing ever happens. That is most likely the root of my problem.
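
For reference, this is roughly what I was running and how I was checking on it (the mon name is a placeholder and the commands are from memory), in case someone spots something obvious:

  ceph orch daemon redeploy mon.ceph-mon1
  ceph orch ps | grep mon     # watch the image/version column to see whether anything actually happened
  ceph log last cephadm       # cephadm's log channel, to look for errors about the redeploy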

Last but not least, keeping all five mon IPs correct in ceph.conf also seemed to be key during the process.
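
In other words, something like this in /etc/ceph/ceph.conf on every node, kept in sync with what 'ceph mon dump' reports (the IPs here are made up):

  [global]
      fsid = <cluster fsid>
      mon_host = 10.0.0.11,10.0.0.12,10.0.0.13,10.0.0.14,10.0.0.15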

Now to try and upgrade the OSDs, and most likely file a ceph orch bug.
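
For the OSDs I'll probably just try letting cephadm drive it now that the mons and mgrs are on Pacific, something roughly like this (image tag from memory, so verify it before use):

  ceph orch upgrade start --image quay.io/ceph/ceph:v16.2.9
  ceph orch upgrade status
  ceph versions     # confirm every daemon ends up on 16.2.9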



July 28, 2022 8:03 AM, "Gregory Farnum" <gfarnum@xxxxxxxxxx> wrote:

> On Wed, Jul 27, 2022 at 4:54 PM <kevin@xxxxxxxxxx> wrote:
> 
>> Currently, all of the nodes are running in docker. The only way to upgrade is to redeploy with
>> docker (ceph orch daemon redeploy), which is essentially making a new monitor. Am I missing
>> something?
> 
> Apparently. I don't have any experience with Docker, and unfortunately
> very little with containers in general, so I'm not sure what process
> you need to follow, though. cephadm certainly manages to do it
> properly — you want to maintain the existing disk store.
> 
> How do you do it for OSDs? Surely you don't throw away an old
> OSD, create a new one, and wait for migration to complete before doing
> the next...
> -Greg
> 
>> Is there some prep work I could/should be doing?
>> 
>> I want to do a staggered upgrade as noted here (https://docs.ceph.com/en/pacific/cephadm/upgrade).
>> That says for a staggered upgrade the order is mgr -> mon, etc. But that was not working for me
>> because it said --daemon-types was not supported.
>> 
>> Basically I'm confused about what the 'proper' way to upgrade is, then. There isn't any way that I
>> see to upgrade the 'code' they are running because it's all in docker containers. But maybe I'm
>> missing something obvious.
>> 
>> Thanks
>> 
>> July 27, 2022 4:34 PM, "Gregory Farnum" <gfarnum@xxxxxxxxxx> wrote:
>> 
>> On Wed, Jul 27, 2022 at 10:24 AM <kevin@xxxxxxxxxx> wrote:
>> 
>> Currently running Octopus 15.2.16, trying to upgrade to Pacific using cephadm.
>> 
>> 3 mon nodes running 15.2.16
>> 2 mgr nodes running 16.2.9
>> 15 OSDs running 15.2.16
>> 
>> The mon/mgr nodes are running in LXC containers on Ubuntu, running docker from the docker repo (not
>> the Ubuntu repo). I used cephadm to remove one of the monitor nodes and then re-add it with a
>> 16.2.9 image. The monitor node runs but never joins the cluster, and this causes the other 2
>> mon nodes to start flapping. I also tried adding 2 mon nodes (for a total of 5 mons) on bare metal
>> running Ubuntu (with docker from the docker repo), and those mons won't join and won't even
>> show up in 'ceph status'.
>> 
>> The way you’re phrasing this, it sounds like you’re removing existing monitors and adding
>> newly-created ones. That won’t work across major version boundaries like this (at least, without a
>> bit of prep work you aren’t doing) because of how monitors bootstrap themselves and their cluster
>> membership. You need to upgrade the code running on the existing monitors instead, which is the
>> documented upgrade process AFAIK.
>> -Greg
>> 
>> Can't find anything in the logs regarding why it's failing. The docker container starts and seems
>> to try to join the cluster but just sits and doesn't join. The other two start flapping and then
>> eventually I have to stop the new mon. I can add the monitor back by changing the container_image
>> to 15.2.16 and it will re-join the cluster as expected.
>> 
>> The cluster was previously running nautilus installed using ceph-deploy
>> 
>> I tried setting 'mon_mds_skip_sanity true' after reading another post, but it doesn't appear to help.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



