Hi,
I saw something like this a couple of weeks ago on a customer cluster.
I'm not entirely sure, but it was caused either by a missing (not yet
configured) or wrong cephadm ssh config, or by a label/client-keyring
management issue. If this is still a problem, I would recommend checking
which keys are configured to be managed by cephadm and whether the
_admin label is correctly distributed across your hosts:
$ ceph orch client-keyring ls
If this is an upgraded cluster, it probably doesn't have any keyrings
defined for management yet. But if you have the _admin label assigned,
that might result in the error messages you are seeing. As I said, I'm
not entirely sure. This is what I have in a Quincy cluster:
quincy-1:~ # ceph orch client-keyring ls
ENTITY        PLACEMENT     MODE       OWNER  PATH
client.admin  label:_admin  rw-------  0:0    /etc/ceph/ceph.client.admin.keyring
You can read more about it in the docs [1].
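If you want to check this on several clusters, here is a minimal sketch that parses the plain-text listing and reports whether client.admin is managed via the _admin label. It assumes the columns are whitespace-separated with a single header row, as in my Quincy output; the function names and the SAMPLE text are only for illustration and are not part of any Ceph tooling.

```python
from typing import Dict, List

# Sample text modelled on the Quincy output quoted in this mail (one row per entry).
SAMPLE = """\
ENTITY        PLACEMENT     MODE       OWNER  PATH
client.admin  label:_admin  rw-------  0:0    /etc/ceph/ceph.client.admin.keyring
"""


def parse_client_keyring_ls(text: str) -> List[Dict[str, str]]:
    """Turn plain `ceph orch client-keyring ls` output into a list of dicts.

    Assumption: whitespace-separated columns and one header row.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []  # nothing is under management (or the output was empty)
    header = [name.lower() for name in lines[0].split()]
    rows = []
    for ln in lines[1:]:
        # Split into at most len(header) fields so the last column keeps its value whole.
        fields = ln.split(None, len(header) - 1)
        if len(fields) != len(header):
            continue  # skip rows that do not match the header layout
        rows.append(dict(zip(header, fields)))
    return rows


def admin_keyring_managed(rows: List[Dict[str, str]]) -> bool:
    """True if client.admin is placed on hosts carrying the _admin label."""
    return any(
        row.get("entity") == "client.admin" and row.get("placement") == "label:_admin"
        for row in rows
    )


if __name__ == "__main__":
    parsed = parse_client_keyring_ls(SAMPLE)
    print(parsed)
    print("client.admin managed via _admin label:", admin_keyring_managed(parsed))
```

An empty listing makes admin_keyring_managed() return False, which matches the upgraded-cluster case I described above.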
If this is not the issue, I would check the ssh config. Inspect the output of
$ ceph cephadm get-ssh-config
$ ceph cephadm get-pub-key
$ ceph cephadm get-user
Make sure the authorized_keys file of that user on each host contains
the public key printed by get-pub-key.
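To see which host and which path actually refused the write, it can help to pull those pieces out of the "Unable to write" message. The following is a small sketch based only on the message format in your log; the WriteFailure type and the function name are hypothetical. One place to look afterwards, though this is my assumption and not something the traceback proves, is whether the user reported by `ceph cephadm get-user` is allowed to write to the staging directory under /tmp on that host.

```python
import posixpath
import re
from typing import NamedTuple, Optional


class WriteFailure(NamedTuple):
    host: str          # host cephadm tried to write to
    target_path: str   # final destination of the file
    staging_path: str  # temporary path the scp actually failed on
    reason: str        # error text reported by scp


# Format taken from the quoted log: "Unable to write <host>:<path>: scp: <tmp path>: <reason>"
_PATTERN = re.compile(
    r"Unable to write (?P<host>[^:\s]+):(?P<target>\S+): "
    r"scp: (?P<staging>\S+): (?P<reason>.+?)(?: Traceback|$)"
)


def parse_write_failure(message: str) -> Optional[WriteFailure]:
    """Extract host and paths from a cephadm 'Unable to write' message."""
    # Pasted logs are often hard-wrapped, so collapse all whitespace first.
    flat = " ".join(message.split())
    match = _PATTERN.search(flat)
    if match is None:
        return None
    return WriteFailure(
        host=match.group("host"),
        target_path=match.group("target"),
        staging_path=match.group("staging"),
        reason=match.group("reason"),
    )


if __name__ == "__main__":
    # Message copied from the log in this thread, including its line wrapping.
    msg = (
        "orchestrator._interface.OrchestratorError: Unable to write\n"
        "dkcphhpcmgt028:/var/lib/ceph/5c384430-da91-11ed-af9c-c780a5227aff/config/ceph.conf: "
        "scp: /tmp/var/lib/ceph/5c384430-da91-11ed-af9c-c780a5227aff/config/ceph.conf.new: Permission\n"
        "denied"
    )
    failure = parse_write_failure(msg)
    if failure is not None:
        print("host:       ", failure.host)
        print("staging dir:", posixpath.dirname(failure.staging_path))
        print("reason:     ", failure.reason)
```

With the host and staging directory in hand, you can log in as the cephadm ssh user and inspect the ownership of that directory by hand.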
Regards,
Eugen
[1] https://docs.ceph.com/en/latest/cephadm/operations/?highlight=client-keyring#putting-a-keyring-under-management
Quoting "Jesper Agerbo Krogh [JSKR]" <JSKR@xxxxxxxxxx>:
Hi.
We're currently getting these errors, and I seem to be missing a
clear overview of the cause and how to debug it.
3/26/24 9:38:09 PM [ERR] executing _write_files((['dkcphhpcadmin01', 'dkcphhpcmgt028', 'dkcphhpcmgt029', 'dkcphhpcmgt031', 'dkcphhpcosd033', 'dkcphhpcosd034', 'dkcphhpcosd035', 'dkcphhpcosd036', 'dkcphhpcosd037', 'dkcphhpcosd038', 'dkcphhpcosd039', 'dkcphhpcosd040', 'dkcphhpcosd041', 'dkcphhpcosd042', 'dkcphhpcosd043', 'dkcphhpcosd044'],)) failed.
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/ssh.py", line 240, in _write_remote_file
    await asyncssh.scp(f.name, (conn, tmp_path))
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 922, in scp
    await source.run(srcpath)
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 458, in run
    self.handle_error(exc)
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 307, in handle_error
    raise exc from None
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 456, in run
    await self._send_files(path, b'')
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 438, in _send_files
    self.handle_error(exc)
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 307, in handle_error
    raise exc from None
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 434, in _send_files
    await self._send_file(srcpath, dstpath, attrs)
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 365, in _send_file
    await self._make_cd_request(b'C', attrs, size, srcpath)
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 343, in _make_cd_request
    self._fs.basename(path))
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 224, in make_request
    raise exc
asyncssh.sftp.SFTPFailure: scp: /tmp/var/lib/ceph/5c384430-da91-11ed-af9c-c780a5227aff/config/ceph.conf.new: Permission denied

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/utils.py", line 79, in do_work
    return f(*arg)
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 1088, in _write_files
    self._write_client_files(client_files, host)
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 1107, in _write_client_files
    self.mgr.ssh.write_remote_file(host, path, content, mode, uid, gid)
  File "/usr/share/ceph/mgr/cephadm/ssh.py", line 261, in write_remote_file
    host, path, content, mode, uid, gid, addr))
  File "/usr/share/ceph/mgr/cephadm/module.py", line 615, in wait_async
    return self.event_loop.get_result(coro)
  File "/usr/share/ceph/mgr/cephadm/ssh.py", line 56, in get_result
    return asyncio.run_coroutine_threadsafe(coro, self._loop).result()
  File "/lib64/python3.6/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/lib64/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/share/ceph/mgr/cephadm/ssh.py", line 249, in _write_remote_file
    raise OrchestratorError(msg)
orchestrator._interface.OrchestratorError: Unable to write dkcphhpcmgt028:/var/lib/ceph/5c384430-da91-11ed-af9c-c780a5227aff/config/ceph.conf: scp: /tmp/var/lib/ceph/5c384430-da91-11ed-af9c-c780a5227aff/config/ceph.conf.new: Permission denied
3/26/24 9:38:09 PM [ERR] Unable to write dkcphhpcmgt028:/var/lib/ceph/5c384430-da91-11ed-af9c-c780a5227aff/config/ceph.conf: scp: /tmp/var/lib/ceph/5c384430-da91-11ed-af9c-c780a5227aff/config/ceph.conf.new: Permission denied
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/ssh.py", line 240, in _write_remote_file
    await asyncssh.scp(f.name, (conn, tmp_path))
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 922, in scp
    await source.run(srcpath)
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 458, in run
    self.handle_error(exc)
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 307, in handle_error
    raise exc from None
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 456, in run
    await self._send_files(path, b'')
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 438, in _send_files
    self.handle_error(exc)
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 307, in handle_error
    raise exc from None
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 434, in _send_files
    await self._send_file(srcpath, dstpath, attrs)
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 365, in _send_file
    await self._make_cd_request(b'C', attrs, size, srcpath)
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 343, in _make_cd_request
    self._fs.basename(path))
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 224, in make_request
    raise exc
asyncssh.sftp.SFTPFailure: scp: /tmp/var/lib/ceph/5c384430-da91-11ed-af9c-c780a5227aff/config/ceph.conf.new: Permission denied
3/26/24 9:38:09 PM [INF] Updating dkcphhpcmgt028:/var/lib/ceph/5c384430-da91-11ed-af9c-c780a5227aff/config/ceph.conf
It seems to be related to the permissions that the manager writes the
files with, and to the process copying them around.
$ sudo ceph -v
[sudo] password for adminjskr:
ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
Best regards,
Jesper Agerbo Krogh
Director Digitalization
Digitalization
Topsoe A/S
Haldor Topsøes Allé 1
2800 Kgs. Lyngby
Denmark
Phone (direct): 27773240
   
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx