For the specific issue with that traceback, you can probably resolve it by removing the stored upgrade state. I believe we put it at `mgr/cephadm/upgrade_state` (if that key isn't there, run "ceph config-key ls" and look for something related to upgrade state), so running "ceph config-key rm mgr/cephadm/upgrade_state" should remove the old one. Then I'd manually downgrade the mgr daemons to avoid this happening again (the process is roughly the same as https://docs.ceph.com/en/quincy/cephadm/upgrade/#upgrading-to-a-version-that-supports-staggered-upgrade-from-one-that-doesn-t), and at that point you should be able to try an upgrade command again. A rough command sketch is below.
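As a sketch only (assuming the daemon names from the `ceph orch ps` output in your mail, that the 16.2.11 mgr on astano02 is currently the active one, and an example image name for 16.2.7), the sequence would look roughly like this:

```
# check for, then remove, the stale upgrade state written by the 16.2.11 mgr
ceph config-key ls | grep -i upgrade
ceph config-key rm mgr/cephadm/upgrade_state

# fail over so the already-downgraded 16.2.7 mgr (astano03) becomes active;
# with the stale upgrade state gone it should load the cephadm module again
ceph mgr fail

# manually redeploy the remaining 16.2.11 mgr on a 16.2.7 image
# (daemon name taken from your "ceph orch ps" output; image name is an example)
ceph orch daemon redeploy mgr.astano02.mzmewn --image quay.io/ceph/ceph:v16.2.7

# once all mgrs are back on 16.2.7, retry the downgrade for the rest of the cluster
ceph orch upgrade start --ceph-version 16.2.7
```

Adjust the fail-over/redeploy order for whichever mgr is active in your cluster; the linked doc walks through the same idea for upgrades.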
On Thu, Mar 30, 2023 at 11:07 AM <elia.oggian@xxxxxxx> wrote:

> Hello,
>
> After a successful upgrade of a Ceph cluster from 16.2.7 to 16.2.11, I
> needed to downgrade it back to 16.2.7, as I found an issue with the new
> version.
>
> I expected that running the downgrade with `ceph orch upgrade start
> --ceph-version 16.2.7` would work fine. However, it blocked right after
> the downgrade of the first MGR daemon. In fact, the downgraded daemon is
> no longer able to use the cephadm module. Any `ceph orch` command fails
> with the following error:
>
> ```
> $ ceph orch ps
> Error ENOENT: Module not found
> ```
>
> The downgrade process is therefore blocked.
>
> These are the MGR logs when the command is issued:
>
> ```
> Mar 28 12:13:15 astano03 ceph-c57586c4-8e44-11eb-a116-248a07aa8d2e-mgr-astano03-qtzccn[2232770]: debug 2023-03-28T10:13:15.557+0000 7f828fe8c700  0 log_channel(audit) log [DBG] : from='client.3136173 -' entity='client.admin' cmd=[{"prefix": "orch ps", "target": ["mon-mgr", ""]}]: dispatch
> Mar 28 12:13:15 astano03 ceph-c57586c4-8e44-11eb-a116-248a07aa8d2e-mgr-astano03-qtzccn[2232770]: debug 2023-03-28T10:13:15.558+0000 7f829068d700  0 [orchestrator DEBUG root] _oremote orchestrator -> cephadm.list_daemons(*(None, None), **{'daemon_id': None, 'host': None, 'refresh': False})
> Mar 28 12:13:15 astano03 ceph-c57586c4-8e44-11eb-a116-248a07aa8d2e-mgr-astano03-qtzccn[2232770]: debug 2023-03-28T10:13:15.558+0000 7f829068d700 -1 no module 'cephadm'
> Mar 28 12:13:15 astano03 ceph-c57586c4-8e44-11eb-a116-248a07aa8d2e-mgr-astano03-qtzccn[2232770]: debug 2023-03-28T10:13:15.558+0000 7f829068d700  0 [orchestrator DEBUG root] _oremote orchestrator -> cephadm.get_feature_set(*(), **{})
> Mar 28 12:13:15 astano03 ceph-c57586c4-8e44-11eb-a116-248a07aa8d2e-mgr-astano03-qtzccn[2232770]: debug 2023-03-28T10:13:15.558+0000 7f829068d700 -1 no module 'cephadm'
> Mar 28 12:13:15 astano03 ceph-c57586c4-8e44-11eb-a116-248a07aa8d2e-mgr-astano03-qtzccn[2232770]: debug 2023-03-28T10:13:15.558+0000 7f829068d700 -1 mgr.server reply reply (2) No such file or directory Module not found
> ```
>
> Other interesting MGR logs are:
>
> ```
> 2023-03-28T11:05:59.519+0000 7fcd16314700  4 mgr get_store get_store key: mgr/cephadm/upgrade_state
> 2023-03-28T11:05:59.519+0000 7fcd16314700 -1 mgr load Failed to construct class in 'cephadm'
> 2023-03-28T11:05:59.519+0000 7fcd16314700 -1 mgr load Traceback (most recent call last):
>   File "/usr/share/ceph/mgr/cephadm/module.py", line 450, in __init__
>     self.upgrade = CephadmUpgrade(self)
>   File "/usr/share/ceph/mgr/cephadm/upgrade.py", line 111, in __init__
>     self.upgrade_state: Optional[UpgradeState] = UpgradeState.from_json(json.loads(t))
>   File "/usr/share/ceph/mgr/cephadm/upgrade.py", line 92, in from_json
>     return cls(**c)
> TypeError: __init__() got an unexpected keyword argument 'daemon_types'
>
> 2023-03-28T11:05:59.521+0000 7fcd16314700 -1 mgr operator() Failed to run module in active mode ('cephadm')
> ```
>
> This seems to relate to the new staggered upgrade feature.
>
> Please note that everything was working fine with version 16.2.7 before
> the upgrade.
>
> I am currently stuck in this situation, with the one MGR daemon still on
> version 16.2.11 being the only one that works correctly:
>
> ```
> [root@astano01 ~]# ceph orch ps | grep mgr
> mgr.astano02.mzmewn  astano02  *:8443,9283  running (5d)  43s ago   2y  455M  -  16.2.11  7a63bce27215  e2d7806acf16
> mgr.astano03.qtzccn  astano03  *:8443,9283  running (3m)  22s ago  95m  383M  -  16.2.7   463ec4b1fdc0  cc0d88864fa1
> ```
>
> Has anyone already faced this issue, or does anyone know how I can make
> the 16.2.7 MGR load the cephadm module correctly?
>
> Thanks in advance for any help!

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx