Re: Issues upgrading cephadm cluster from Octopus.

I don't know for sure that it will fix the issue, but the migrations are
driven by a config option, "mgr/cephadm/migration_current". You could try
setting that back to 0; after restarting/failing over the mgr, the
migrations should run again. They're meant to be idempotent, so in the
worst case it simply won't accomplish anything. Also, you're correct that
this isn't in the docs. The migrations were intended to be internal and
never require user action, but it appears something has gone wrong in
this case.
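
Roughly, what I have in mind is something like this (a quick sketch, not
tested against your cluster; double-check the syntax on your release):

  # reset the cephadm migration pointer so migrations are re-run
  ceph config set mgr mgr/cephadm/migration_current 0
  # fail over to a standby so the cephadm module restarts and re-runs them
  ceph mgr fail    # on older releases you may need: ceph mgr fail <active-mgr>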

On Fri, Nov 18, 2022 at 3:06 PM Seth T Graham <sether@xxxxxxxx> wrote:

> We have a cluster running Octopus (15.2.17) that I need to get updated,
> but I am getting cephadm failures when updating the managers; I have
> tried both Pacific and Quincy with the same results. The cluster was
> deployed with cephadm on CentOS Stream 8 using podman, and because of
> the cluster's network isolation the images are pulled from a private
> registry. When I issue the 'ceph orch upgrade' command it starts out
> well, updating two of the three managers. When it gets to the point of
> transitioning to one of the upgraded managers, the process stops with
> an error, and 'ceph status' reports that the cephadm module has failed.
>
> Digging through the logs, I find a Python stack trace that reads:
>
>   File "/usr/share/ceph/mgr/cephadm/module.py", line 587, in serve
>     serve.serve()
>   File "/usr/share/ceph/mgr/cephadm/serve.py", line 67, in serve
>     self.convert_tags_to_repo_digest()
>   File "/usr/share/ceph/mgr/cephadm/serve.py", line 974, in
> convert_tags_to_repo_digest
>     self._get_container_image_info(container_image_ref))
>   File "/usr/share/ceph/mgr/cephadm/module.py", line 590, in wait_async
>     return self.event_loop.get_result(coro)
>   File "/usr/share/ceph/mgr/cephadm/ssh.py", line 48, in get_result
>     return asyncio.run_coroutine_threadsafe(coro, self._loop).result()
>   File "/lib64/python3.6/concurrent/futures/_base.py", line 432, in result
>     return self.__get_result()
>   File "/lib64/python3.6/concurrent/futures/_base.py", line 384, in
> __get_result
>     raise self._exception
>   File "/usr/share/ceph/mgr/cephadm/serve.py", line 1374, in
> _get_container_image_info
>     await self._registry_login(host,
> json.loads(str(self.mgr.get_store('registry_credentials'))))
>   File "/lib64/python3.6/json/__init__.py", line 354, in loads
>     return _default_decoder.decode(s)
>   File "/lib64/python3.6/json/decoder.py", line 339, in decode
>     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
>   File "/lib64/python3.6/json/decoder.py", line 357, in raw_decode
>     raise JSONDecodeError("Expecting value", s, err.value) from None
> json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
>
>
> Looking through the ceph config, there is indeed no setting for the
> 'registry_credentials' value. Instead I have the registry_password,
> registry_url and registry_username values that were set when the
> cluster was provisioned.
>
> I do find mention of this key in the migrations.py script (which lives
> in /usr/share/ceph/mgr/cephadm), in the function 'migrate_4_5', which
> reads to me like the old keys have been retired in favor of a unified
> key containing a JSON object. So I attempted to recreate what that
> function does by setting the key manually, but unfortunately this
> didn't help.
>
> (e.g., 'ceph config set mgr mgr/cephadm/registry_credentials '{ "url":
> "XXX", "username": "XXX", "password": "XXX" }'')
>
> I'm not sure where to go from here. Is there a 'migrate' option I can
> specify somewhere to properly upgrade this cluster, and perhaps run the
> code found in migrations.py? I don't see any mention of this in the
> documentation, but there's a lot of documentation so it's possible I missed
> it.
>
> Failing that, are there any suggestions for a workaround so I can get this
> upgrade completed?
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


