Hi all!

> It might be worth trying to manually upgrade one of the mgr daemons

Yep, this did it! Thank you!

> And are any of the hosts shown as offline in the 'ceph orch host ls' output?
> Is this the first upgrade you're attempting or did previous upgrades
> work with the current config?

All hosts were offline, and I'm not sure whether an upgrade has worked before or not; that was before my time, sadly.

________________________________
From: Adam King <adking@xxxxxxxxxx>
Sent: 7 August 2024 17:45
To: Magnus Larsen <magnusfynbo@xxxxxxxxxxx>
Cc: ceph-users@xxxxxxx <ceph-users@xxxxxxx>
Subject: Re: Cephadm: unable to copy ceph.conf.new

It might be worth trying to manually upgrade one of the mgr daemons. Go to a host with a mgr and edit /var/lib/ceph/<fsid>/<mgr-daemon-name>/unit.run so that the image specified in the long podman/docker run command in there is the 17.2.7 image. Then just restart its systemd unit (don't tell the orchestrator to restart the mgr; that can cause your change to the unit.run file to be overwritten). If you only have two mgr daemons, you should be able to use failovers to make that one the active mgr, at which point the active mgr will have the patch that fixes this issue and you should be able to get the upgrade going.

`ceph orch daemon redeploy <mgr-daemon-name> --image <17.2.7 image>` might also work, but I tend to find the manual steps are more reliable for this sort of issue, as you don't have to worry about issues within the orchestrator causing that operation to fail.

On Tue, Aug 6, 2024 at 7:26 PM Magnus Larsen <magnusfynbo@xxxxxxxxxxx> wrote:

Hi Ceph-users!
Ceph version: ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
Using cephadm to orchestrate the Ceph cluster

I'm running into https://tracker.ceph.com/issues/59189, which is fixed in the next version (quincy 17.2.7) via https://github.com/ceph/ceph/pull/50906

But I am unable to upgrade to the fixed version because of that bug.

When I try to upgrade (using "ceph orch upgrade start --image internal_mirror/ceph:v17.2.7"), we see the same error message:

executing _write_files((['dkcphhpcadmin01', 'dkcphhpcmgt028', 'dkcphhpcmgt029', 'dkcphhpcmgt031', 'dkcphhpcosd033', 'dkcphhpcosd034', 'dkcphhpcosd035', 'dkcphhpcosd036', 'dkcphhpcosd037', 'dkcphhpcosd038', 'dkcphhpcosd039', 'dkcphhpcosd040', 'dkcphhpcosd041', 'dkcphhpcosd042', 'dkcphhpcosd043', 'dkcphhpcosd044'],)) failed.
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/ssh.py", line 240, in _write_remote_file
    conn = await self._remote_connection(host, addr)
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 922, in scp
    await source.run(srcpath)
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 458, in run
    self.handle_error(exc)
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 307, in handle_error
    raise exc from None
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 456, in run
    await self._send_files(path, b'')
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 438, in _send_files
    self.handle_error(exc)
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 307, in handle_error
    raise exc from None
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 434, in _send_files
    await self._send_file(srcpath, dstpath, attrs)
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 365, in _send_file
    await self._make_cd_request(b'C', attrs, size, srcpath)
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 343, in _make_cd_request
    self._fs.basename(path))
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 224, in make_request
    raise exc
asyncssh.sftp.SFTPFailure: scp: /tmp/etc/ceph/ceph.conf.new: Permission denied

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/utils.py", line 79, in do_work
    return f(*arg)
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 1088, in _write_files
    self._write_client_files(client_files, host)
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 1107, in _write_client_files
    self.mgr.ssh.write_remote_file(host, path, content, mode, uid, gid)
  File "/usr/share/ceph/mgr/cephadm/ssh.py", line 261, in write_remote_file
    self.mgr.wait_async(self._write_remote_file(
  File "/usr/share/ceph/mgr/cephadm/module.py", line 615, in wait_async
    return self.event_loop.get_result(coro)
  File "/usr/share/ceph/mgr/cephadm/ssh.py", line 56, in get_result
    return asyncio.run_coroutine_threadsafe(coro, self._loop).result()
  File "/lib64/python3.6/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/lib64/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/share/ceph/mgr/cephadm/ssh.py", line 249, in _write_remote_file
    logger.exception(msg)
orchestrator._interface.OrchestratorError: Unable to write dkcphhpcmgt028:/etc/ceph/ceph.conf: scp: /tmp/etc/ceph/ceph.conf.new: Permission denied

We were thinking about removing the keyring from the Ceph orchestrator (https://docs.ceph.com/en/latest/cephadm/operations/#putting-a-keyring-under-management), which would then make Ceph not try to copy over a new ceph.conf, alleviating the problem (https://docs.ceph.com/en/latest/cephadm/operations/#client-keyrings-and-configs). But in doing so, Ceph will kindly remove the key from all nodes (https://docs.ceph.com/en/latest/cephadm/operations/#disabling-management-of-a-keyring-file), leaving us without the admin keyring.
So that doesn't sound like a path we want to take :S

Does anybody know how to get around this issue, so I can get to a version where the issue is fixed for good?

Thanks,
Magnus
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
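
For reference, the unit.run edit described in Adam's reply can be sketched in Python. This is only an illustration under assumptions, not part of cephadm: the helper name swap_image and the sample unit.run content are hypothetical, and a real unit.run is much longer and may reference the image by digest rather than by tag. The sketch only rewrites text; it does not touch any file or restart any daemon.

```python
def swap_image(unit_run_text: str, old_image: str, new_image: str):
    """Return (new_text, count) with every occurrence of old_image replaced.

    Hypothetical helper: it works on the text of a unit.run file that the
    caller has already read, and raises if the old image is not present,
    so a typo in the image name does not pass silently.
    """
    count = unit_run_text.count(old_image)
    if count == 0:
        raise ValueError(f"image {old_image!r} not found in unit.run content")
    return unit_run_text.replace(old_image, new_image), count


# Hypothetical, heavily shortened stand-in for a mgr daemon's unit.run.
sample = (
    "/usr/bin/podman run --rm --name ceph-mgr "
    "-e CONTAINER_IMAGE=internal_mirror/ceph:v17.2.6 "
    "internal_mirror/ceph:v17.2.6 -n mgr.example -f\n"
)

updated, replaced = swap_image(
    sample,
    "internal_mirror/ceph:v17.2.6",
    "internal_mirror/ceph:v17.2.7",
)
print(replaced)
print(updated)
```

After writing the updated text back to unit.run on the mgr host, the remaining steps are the ones from the reply: restart that mgr's systemd unit directly (not through the orchestrator), then fail over so the patched mgr becomes active before starting the upgrade again.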