It might be worth trying to manually upgrade one of the mgr daemons. Go to
a host that runs a mgr and edit
/var/lib/ceph/<fsid>/<mgr-daemon-name>/unit.run so that the image specified
in the long podman/docker run command in there is the 17.2.7 image. Then
just restart its systemd unit (don't tell the orchestrator to do the
restart of the mgr; that can cause your change to the unit.run file to be
overwritten). If you only have two mgr daemons, you should be able to use
failovers to make that one the active mgr, at which point the active mgr
will have the patch that fixes this issue and you should be able to get the
upgrade going.

`ceph orch daemon redeploy <mgr-daemon-name> --image <17.2.7 image>` might
also work, but I tend to find the manual steps are more reliable for this
sort of issue, as you don't have to worry about issues within the
orchestrator causing that operation to fail.

On Tue, Aug 6, 2024 at 7:26 PM Magnus Larsen <magnusfynbo@xxxxxxxxxxx>
wrote:

> Hi Ceph-users!
>
> Ceph version: ceph version 17.2.6
> (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
> Using cephadm to orchestrate the Ceph cluster
>
> I’m running into https://tracker.ceph.com/issues/59189, which is fixed in
> the next version (quincy 17.2.7) via
> https://github.com/ceph/ceph/pull/50906
>
> But I am unable to upgrade to the fixed version because of that bug.
>
> When I try to upgrade (using “ceph orch upgrade start --image
> internal_mirror/ceph:v17.2.7”), we see the same error message:
>
> executing _write_files((['dkcphhpcadmin01', 'dkcphhpcmgt028',
> 'dkcphhpcmgt029', 'dkcphhpcmgt031', 'dkcphhpcosd033', 'dkcphhpcosd034',
> 'dkcphhpcosd035', 'dkcphhpcosd036', 'dkcphhpcosd037', 'dkcphhpcosd038',
> 'dkcphhpcosd039', 'dkcphhpcosd040', 'dkcphhpcosd041', 'dkcphhpcosd042',
> 'dkcphhpcosd043', 'dkcphhpcosd044'],)) failed.
> Traceback (most recent call last):
>   File "/usr/share/ceph/mgr/cephadm/ssh.py", line 240, in _write_remote_file
>     conn = await self._remote_connection(host, addr)
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 922, in scp
>     await source.run(srcpath)
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 458, in run
>     self.handle_error(exc)
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 307, in handle_error
>     raise exc from None
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 456, in run
>     await self._send_files(path, b'')
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 438, in _send_files
>     self.handle_error(exc)
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 307, in handle_error
>     raise exc from None
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 434, in _send_files
>     await self._send_file(srcpath, dstpath, attrs)
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 365, in _send_file
>     await self._make_cd_request(b'C', attrs, size, srcpath)
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 343, in _make_cd_request
>     self._fs.basename(path))
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 224, in make_request
>     raise exc
> asyncssh.sftp.SFTPFailure: scp: /tmp/etc/ceph/ceph.conf.new: Permission denied
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "/usr/share/ceph/mgr/cephadm/utils.py", line 79, in do_work
>     return f(*arg)
>   File "/usr/share/ceph/mgr/cephadm/serve.py", line 1088, in _write_files
>     self._write_client_files(client_files, host)
>   File "/usr/share/ceph/mgr/cephadm/serve.py", line 1107, in _write_client_files
>     self.mgr.ssh.write_remote_file(host, path, content, mode, uid, gid)
>   File "/usr/share/ceph/mgr/cephadm/ssh.py", line 261, in write_remote_file
>     self.mgr.wait_async(self._write_remote_file(
>   File "/usr/share/ceph/mgr/cephadm/module.py", line 615, in wait_async
>     return self.event_loop.get_result(coro)
>   File "/usr/share/ceph/mgr/cephadm/ssh.py", line 56, in get_result
>     return asyncio.run_coroutine_threadsafe(coro, self._loop).result()
>   File "/lib64/python3.6/concurrent/futures/_base.py", line 432, in result
>     return self.__get_result()
>   File "/lib64/python3.6/concurrent/futures/_base.py", line 384, in __get_result
>     raise self._exception
>   File "/usr/share/ceph/mgr/cephadm/ssh.py", line 249, in _write_remote_file
>     logger.exception(msg)
> orchestrator._interface.OrchestratorError: Unable to write
> dkcphhpcmgt028:/etc/ceph/ceph.conf: scp: /tmp/etc/ceph/ceph.conf.new:
> Permission denied
>
> We were thinking about removing the keyring from the Ceph orchestrator
> (https://docs.ceph.com/en/latest/cephadm/operations/#putting-a-keyring-under-management),
> which would then make Ceph not try to copy over a new ceph.conf,
> alleviating the problem
> (https://docs.ceph.com/en/latest/cephadm/operations/#client-keyrings-and-configs),
> but in doing so, Ceph will kindly remove the key from all nodes
> (https://docs.ceph.com/en/latest/cephadm/operations/#disabling-management-of-a-keyring-file),
> leaving us without the admin keyring. So that doesn’t sound like a path we
> want to take :S
>
> Does anybody know how to get around this issue, so I can get to a version
> where the issue is fixed for good?
>
> Thanks,
> Magnus
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
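
For reference, the manual mgr upgrade described at the top of this message
could be scripted roughly as follows. This is only a sketch:
set_unit_run_image is a made-up helper name, it assumes the current image
string appears literally in unit.run, and the systemd unit name and the
failover command in the trailing comments are assumptions about a typical
cephadm deployment, so check them against your own hosts before running
anything.

```shell
#!/bin/sh
# Sketch only: swap the container image referenced in a cephadm unit.run file.
# set_unit_run_image is a hypothetical helper, not part of cephadm.
set -eu

# set_unit_run_image FILE OLD_IMAGE NEW_IMAGE
# Replaces every literal occurrence of OLD_IMAGE in FILE with NEW_IMAGE and
# keeps the previous version of the file as FILE.bak.
set_unit_run_image() {
    local file="$1"
    local old="$2"
    local new="$3"
    local old_esc new_esc

    # Refuse to touch the file if the old image string is not present.
    if ! grep -F -q -- "$old" "$file"; then
        echo "image '$old' not found in $file" >&2
        return 1
    fi

    # Escape regex metacharacters in OLD and '&', '\', '/' in NEW so that
    # sed treats both strings literally.
    old_esc=$(printf '%s' "$old" | sed 's|[][\.*^$/]|\\&|g')
    new_esc=$(printf '%s' "$new" | sed 's|[\&/]|\\&|g')

    cp -p -- "$file" "$file.bak"
    sed "s/$old_esc/$new_esc/g" "$file.bak" > "$file"
}

# Possible use on the mgr host (every value below is a placeholder, and the
# unit name format is an assumption):
#   set_unit_run_image /var/lib/ceph/<fsid>/<mgr-daemon-name>/unit.run \
#       '<current image>' '<17.2.7 image>'
#   systemctl restart ceph-<fsid>@<mgr-daemon-name>.service
#   ceph mgr fail <active-mgr-name>   # fail over until the patched mgr is active
```

The backup copy is there so the original unit.run can be restored by hand
if the restarted mgr does not come up. The restart deliberately goes through
systemctl rather than the orchestrator, for the reason given above.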