Happy Friday all. I was hoping someone could point me in the right
direction or clarify any limitations that could be impacting an
issue I am having.
I'm struggling to add a new set of hosts to my ceph cluster using
cephadm and orchestration. When trying to add a host:
"ceph orch host add <hostname> 172.31.102.41 --labels _admin"
returns:
"Error EINVAL: Can't communicate with remote host
`172.31.102.41`, possibly because python3 is not installed there:
[Errno 12] Cannot allocate memory"
I've verified that the ceph ssh key works to the remote host, the
host's name matches the output of `hostname`, python3 is installed,
and "/usr/sbin/cephadm prepare-host" on the new hosts returns "host
is ok". In addition, the cluster ssh key works between hosts, and
the existing hosts are able to ssh in using the ceph key.
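For anyone wanting to repeat the SSH check by hand, here is a rough
sketch of the steps, written as a small Python helper that only
builds the command lines and runs nothing. It assumes the mgr's SSH
config and key are exported with `ceph cephadm get-ssh-config` and
`ceph config-key get mgr/cephadm/ssh_identity_key`, as described in
the cephadm troubleshooting docs. The file names, the `user` default
and the helper name `manual_ssh_check` are my own placeholders.

```python
import shlex


def manual_ssh_check(addr, user="root",
                     ssh_config="ssh_config", key="cephadm_private_key"):
    """Return shell command lines for a manual cephadm SSH check.

    Nothing is executed here; the caller decides where to run them.
    All default file names are placeholders, not cephadm conventions.
    """
    exports = [
        # Export the SSH client config used by the cephadm mgr module.
        (["ceph", "cephadm", "get-ssh-config"], ssh_config),
        # Export the private key the mgr uses to reach the hosts.
        (["ceph", "config-key", "get", "mgr/cephadm/ssh_identity_key"], key),
    ]
    lines = [f"{shlex.join(argv)} > {shlex.quote(out)}" for argv, out in exports]
    # ssh refuses private keys that are readable by other users.
    lines.append(f"chmod 0600 {shlex.quote(key)}")
    # Connect the same way the mgr would and ask the remote for its python3.
    lines.append(shlex.join(
        ["ssh", "-F", ssh_config, "-i", key, f"{user}@{addr}",
         "python3", "--version"]))
    return lines
```

Printing `"\n".join(manual_ssh_check("172.31.102.41"))` gives a short
script that can be pasted on an existing cluster host. If the cluster
uses a non-root ssh user, pass it as `user`.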
The existing ceph cluster runs the Pacific release with docker-based
containers on a RockyLinux8 base OS. The new hosts are RockyLinux9
based, with cephadm installed from the Quincy release:
./cephadm add-repo --release quincy
./cephadm install
I did try installing cephadm from the Pacific release by changing
the repo to el8, but that did not work either.
Is there a limitation in mixing RL8 and RL9 container hosts under
Pacific? Does this same limitation exist under Quincy? Is there a
python version dependency?
The reason for RL9 on the new hosts is to stage an OS upgrade for
the whole cluster. I did the same under Octopus when moving from
CentOS 7 to RL8.
Thanks and I appreciate any feedback/pointers.
Gary
I've added the log trace here in case that helps (from `ceph log
last cephadm`)
2024-02-02T14:22:32.610048+0000 mgr.storage01.oonvfl (mgr.441023307) 4957871 : cephadm [ERR] Can't communicate with remote host `172.31.102.41`, possibly because python3 is not installed there: [Errno 12] Cannot allocate memory
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 1524, in _remote_connection
    conn, connr = self.mgr._get_connection(addr)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1370, in _get_connection
    sudo=True if self.ssh_user != 'root' else False)
  File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 35, in __init__
    self.gateway = self._make_gateway(hostname)
  File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 46, in _make_gateway
    self._make_connection_string(hostname)
  File "/lib/python3.6/site-packages/execnet/multi.py", line 133, in makegateway
    io = gateway_io.create_io(spec, execmodel=self.execmodel)
  File "/lib/python3.6/site-packages/execnet/gateway_io.py", line 121, in create_io
    io = Popen2IOMaster(args, execmodel)
  File "/lib/python3.6/site-packages/execnet/gateway_io.py", line 21, in __init__
    self.popen = p = execmodel.PopenPiped(args)
  File "/lib/python3.6/site-packages/execnet/gateway_base.py", line 184, in PopenPiped
    return self.subprocess.Popen(args, stdout=PIPE, stdin=PIPE)
  File "/lib64/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/lib64/python3.6/subprocess.py", line 1295, in _execute_child
    restore_signals, start_new_session, preexec_fn)
OSError: [Errno 12] Cannot allocate memory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 1528, in _remote_connection
    raise execnet.gateway_bootstrap.HostNotFound(msg)
execnet.gateway_bootstrap.HostNotFound: Can't communicate with remote host `172.31.102.41`, possibly because python3 is not installed there: [Errno 12] Cannot allocate memory

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 125, in wrapper
    return OrchResult(f(*args, **kwargs))
  File "/usr/share/ceph/mgr/cephadm/module.py", line 2709, in apply
    results.append(self._apply(spec))
  File "/usr/share/ceph/mgr/cephadm/module.py", line 2574, in _apply
    return self._add_host(cast(HostSpec, spec))
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1517, in _add_host
    ip_addr = self._check_valid_addr(spec.hostname, spec.addr)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1498, in _check_valid_addr
    error_ok=True, no_fsid=True)
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 1326, in _run_cephadm
    with self._remote_connection(host, addr) as tpl:
  File "/lib64/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 1558, in _remote_connection
    raise OrchestratorError(msg) from e
orchestrator._interface.OrchestratorError: Can't communicate with remote host `172.31.102.41`, possibly because python3 is not installed there: [Errno 12] Cannot allocate memory
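
Reading the trace, the OSError is raised inside subprocess.Popen
(`_execute_child`) in the mgr's own Python 3.6, i.e. while the mgr is
starting the local process for the connection, and is then re-raised
with the "python3 is not installed" wording. So the message may not
say much about the new host itself. A minimal Python sketch of that
wrapping pattern is below. It is not the cephadm source:
`open_gateway`, the local `HostNotFound` class and the `popen`
parameter are hypothetical names used only for illustration.

```python
import errno
import subprocess


class HostNotFound(Exception):
    """Local stand-in for execnet.gateway_bootstrap.HostNotFound."""


def open_gateway(addr, argv, popen=subprocess.Popen):
    """Start the local helper process that would carry the connection.

    `popen` is injectable so the failure path can be exercised without
    really exhausting memory; by default it is subprocess.Popen.
    """
    try:
        return popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    except OSError as e:
        # A failure to start the process on *this* side (for example
        # ENOMEM from fork) ends up in the same generic message as a
        # genuine problem on the remote host.
        msg = ("Can't communicate with remote host `%s`, possibly because "
               "python3 is not installed there: %s" % (addr, e))
        raise HostNotFound(msg) from e


def is_local_enomem(exc):
    """True if the chained cause is a local 'Cannot allocate memory'."""
    cause = exc.__cause__
    return isinstance(cause, OSError) and cause.errno == errno.ENOMEM
```

Calling `open_gateway` with a `popen` that raises
`OSError(errno.ENOMEM, "Cannot allocate memory")` reproduces the text
of the error above, and `is_local_enomem` on the resulting exception
returns True, which is what makes me suspect the mgr side rather than
the RL9 hosts.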
--
Gary Molenkamp Science Technology Services
Systems Engineer University of Western Ontario
molenkam@xxxxxx http://sts.sci.uwo.ca
(519) 661-2111 x86882 (519) 661-3566