Hello,

After a successful upgrade of a Ceph cluster from 16.2.7 to 16.2.11, I needed to downgrade it back to 16.2.7 because I found an issue with the new version. I expected that running the downgrade with

`ceph orch upgrade start --ceph-version 16.2.7`

would work fine. However, it got stuck right after the downgrade of the first MGR daemon: the downgraded daemon is no longer able to use the cephadm module. Any `ceph orch` command fails with the following error:

```
$ ceph orch ps
Error ENOENT: Module not found
```

and the downgrade process is therefore blocked. These are the MGR logs when issuing the command:

```
Mar 28 12:13:15 astano03 ceph-c57586c4-8e44-11eb-a116-248a07aa8d2e-mgr-astano03-qtzccn[2232770]: debug 2023-03-28T10:13:15.557+0000 7f828fe8c700 0 log_channel(audit) log [DBG] : from='client.3136173 -' entity='client.admin' cmd=[{"prefix": "orch ps", "target": ["mon-mgr", ""]}]: dispatch
Mar 28 12:13:15 astano03 ceph-c57586c4-8e44-11eb-a116-248a07aa8d2e-mgr-astano03-qtzccn[2232770]: debug 2023-03-28T10:13:15.558+0000 7f829068d700 0 [orchestrator DEBUG root] _oremote orchestrator -> cephadm.list_daemons(*(None, None), **{'daemon_id': None, 'host': None, 'refresh': False})
Mar 28 12:13:15 astano03 ceph-c57586c4-8e44-11eb-a116-248a07aa8d2e-mgr-astano03-qtzccn[2232770]: debug 2023-03-28T10:13:15.558+0000 7f829068d700 -1 no module 'cephadm'
Mar 28 12:13:15 astano03 ceph-c57586c4-8e44-11eb-a116-248a07aa8d2e-mgr-astano03-qtzccn[2232770]: debug 2023-03-28T10:13:15.558+0000 7f829068d700 0 [orchestrator DEBUG root] _oremote orchestrator -> cephadm.get_feature_set(*(), **{})
Mar 28 12:13:15 astano03 ceph-c57586c4-8e44-11eb-a116-248a07aa8d2e-mgr-astano03-qtzccn[2232770]: debug 2023-03-28T10:13:15.558+0000 7f829068d700 -1 no module 'cephadm'
Mar 28 12:13:15 astano03 ceph-c57586c4-8e44-11eb-a116-248a07aa8d2e-mgr-astano03-qtzccn[2232770]: debug 2023-03-28T10:13:15.558+0000 7f829068d700 -1 mgr.server reply reply (2) No such file or directory Module not found
```

Other interesting MGR logs are:

```
2023-03-28T11:05:59.519+0000 7fcd16314700 4 mgr get_store get_store key: mgr/cephadm/upgrade_state
2023-03-28T11:05:59.519+0000 7fcd16314700 -1 mgr load Failed to construct class in 'cephadm'
2023-03-28T11:05:59.519+0000 7fcd16314700 -1 mgr load Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 450, in __init__
    self.upgrade = CephadmUpgrade(self)
  File "/usr/share/ceph/mgr/cephadm/upgrade.py", line 111, in __init__
    self.upgrade_state: Optional[UpgradeState] = UpgradeState.from_json(json.loads(t))
  File "/usr/share/ceph/mgr/cephadm/upgrade.py", line 92, in from_json
    return cls(**c)
TypeError: __init__() got an unexpected keyword argument 'daemon_types'
2023-03-28T11:05:59.521+0000 7fcd16314700 -1 mgr operator() Failed to run module in active mode ('cephadm')
```

This seems to be related to the new staggered upgrades feature (a minimal sketch of how I understand the failure is at the end of this message). Please note that everything was working fine with version 16.2.7 before the upgrade.

I am currently stuck in this situation, with only one MGR daemon still on version 16.2.11, which is also the only one still working correctly:

```
[root@astano01 ~]# ceph orch ps | grep mgr
mgr.astano02.mzmewn  astano02  *:8443,9283  running (5d)  43s ago   2y   455M  -  16.2.11  7a63bce27215  e2d7806acf16
mgr.astano03.qtzccn  astano03  *:8443,9283  running (3m)  22s ago   95m  383M  -  16.2.7   463ec4b1fdc0  cc0d88864fa1
```

Has anyone already faced this issue, or does anyone know how I can make the 16.2.7 MGR load the cephadm module correctly?

Thanks in advance for any help!
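P.S. For what it's worth, here is a minimal Python sketch of the failure as I understand it. This is illustrative only, not the actual Ceph source: the class fields and the stored JSON below are simplified assumptions. The idea is that the 16.2.11 MGR persists its upgrade state (including the new staggered-upgrade fields such as `daemon_types`) under `mgr/cephadm/upgrade_state`, and the 16.2.7 `UpgradeState.from_json()` then expands that dict into keyword arguments its `__init__()` does not accept.

```python
import json

# Simplified stand-in for the pre-staggered-upgrades UpgradeState (as in
# 16.2.7); the field list here is illustrative, not the real one.
class UpgradeState:
    def __init__(self, target_name, progress_id, error=None, paused=False):
        self.target_name = target_name
        self.progress_id = progress_id
        self.error = error
        self.paused = paused

    @classmethod
    def from_json(cls, data):
        # The module rebuilds its state from the stored JSON by expanding
        # the dict into keyword arguments, so any key written by a newer
        # version that this class does not know about is fatal.
        return cls(**data)

# Hypothetical content of mgr/cephadm/upgrade_state as written by a
# 16.2.11 MGR during the downgrade (note the extra 'daemon_types' key).
stored = json.loads(
    '{"target_name": "quay.io/ceph/ceph:v16.2.7",'
    ' "progress_id": "1234-5678",'
    ' "daemon_types": ["mgr", "mon"]}'
)

try:
    UpgradeState.from_json(stored)
except TypeError as exc:
    # Prints: __init__() got an unexpected keyword argument 'daemon_types'
    print(exc)
```

If that reading is right, the 16.2.7 module will keep failing to load as long as the stored upgrade state still contains the newer fields.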