Re: Problems adding a new host via orchestration.

I have verified the server's expected hostname (with `hostname`) matches the hostname I am trying to use.
Just to be sure, I also ran:
    cephadm check-host --expect-hostname <hostname>
and it returns:
    Hostname "<hostname>" matches what is expected.

From the admin server where I am trying to add the host, the new host is reachable, and its short name resolves to the correct IP via the DNS search order. Likewise, from the server where the mgr is running, I have confirmed both reachability and DNS resolution for the new host.
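
For anyone who wants to script the same name-resolution check, here is a minimal sketch in Python using only the standard library. The helper name `resolves_to` and its injectable `resolver` parameter are illustrative assumptions, not part of cephadm; the default resolver is `socket.getaddrinfo`, which on a typical Linux host follows the normal lookup path (/etc/hosts, then DNS with the configured search domains).

```python
import socket


def resolves_to(name, expected_ip, resolver=socket.getaddrinfo):
    """Return True if `name` resolves to `expected_ip`.

    `resolver` is a hypothetical hook so the lookup can be replaced in
    tests; by default the system resolver is used.
    """
    try:
        infos = resolver(name, None)
    except socket.gaierror:
        # The name could not be resolved at all.
        return False
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    addresses = {info[4][0] for info in infos}
    return expected_ip in addresses
```

Running `resolves_to("<hostname>", "172.31.102.41")` on both the admin server and the mgr host would mirror the manual check described above.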

I also suspected a DNS/name resolution issue, but I don't see any errors in my setup with respect to host naming.

Thanks
Gary


On 2024-02-03 06:46, Eugen Block wrote:
Hi,

I found this blog post [1] which reports the same error message. It seems a bit misleading because it appears to be about DNS. Can you check

cephadm check-host --expect-hostname <HOSTNAME>

Or is that what you already tried? It's not entirely clear how you checked the hostname.

Regards,
Eugen

[1] https://blog.mousetech.com/ceph-distributed-file-system-for-the-enterprise/ceph-bogus-error-cannot-allocate-memory/

Quoting Gary Molenkamp <molenkam@xxxxxx>:

Happy Friday all.  I was hoping someone could point me in the right direction or clarify any limitations that could be impacting an issue I am having.

I'm struggling to add a new set of hosts to my ceph cluster using cephadm and orchestration.  When trying to add a host:
    "ceph orch host add <hostname> 172.31.102.41 --labels _admin"
returns:
    "Error EINVAL: Can't communicate with remote host `172.31.102.41`, possibly because python3 is not installed there: [Errno 12] Cannot allocate memory"

I've verified that the ceph ssh key works to the remote host, that the host's name matches the output of `hostname`, that python3 is installed, and that "/usr/sbin/cephadm prepare-host" on the new hosts returns "host is ok". In addition, the cluster ssh key works between hosts, and the existing hosts are able to ssh in using the ceph key.

The existing ceph cluster runs the Pacific release with docker-based containerization on a RockyLinux8 base OS. The new hosts are RockyLinux9 based, with cephadm installed from the Quincy release:
        ./cephadm add-repo --release quincy
        ./cephadm install
I did try installing cephadm from the Pacific release by changing the repo to el8, but that did not work either.

Is there a limitation in mixing RL8 and RL9 container hosts under Pacific? Does this same limitation exist under Quincy? Is there a python version dependency? The reason for RL9 on the new hosts is to stage upgrading the OS for the cluster. I did this under Octopus when moving from CentOS 7 to RL8.

Thanks and I appreciate any feedback/pointers.
Gary


I've added the log trace here in case that helps (from `ceph log last cephadm`)



2024-02-02T14:22:32.610048+0000 mgr.storage01.oonvfl (mgr.441023307) 4957871 : cephadm [ERR] Can't communicate with remote host `172.31.102.41`, possibly because python3 is not installed there: [Errno 12] Cannot allocate memory
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 1524, in _remote_connection
    conn, connr = self.mgr._get_connection(addr)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1370, in _get_connection
    sudo=True if self.ssh_user != 'root' else False)
  File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 35, in __init__
    self.gateway = self._make_gateway(hostname)
  File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 46, in _make_gateway
    self._make_connection_string(hostname)
  File "/lib/python3.6/site-packages/execnet/multi.py", line 133, in makegateway
    io = gateway_io.create_io(spec, execmodel=self.execmodel)
  File "/lib/python3.6/site-packages/execnet/gateway_io.py", line 121, in create_io
    io = Popen2IOMaster(args, execmodel)
  File "/lib/python3.6/site-packages/execnet/gateway_io.py", line 21, in __init__
    self.popen = p = execmodel.PopenPiped(args)
  File "/lib/python3.6/site-packages/execnet/gateway_base.py", line 184, in PopenPiped
    return self.subprocess.Popen(args, stdout=PIPE, stdin=PIPE)
  File "/lib64/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/lib64/python3.6/subprocess.py", line 1295, in _execute_child
    restore_signals, start_new_session, preexec_fn)
OSError: [Errno 12] Cannot allocate memory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 1528, in _remote_connection
    raise execnet.gateway_bootstrap.HostNotFound(msg)
execnet.gateway_bootstrap.HostNotFound: Can't communicate with remote host `172.31.102.41`, possibly because python3 is not installed there: [Errno 12] Cannot allocate memory

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 125, in wrapper
    return OrchResult(f(*args, **kwargs))
  File "/usr/share/ceph/mgr/cephadm/module.py", line 2709, in apply
    results.append(self._apply(spec))
  File "/usr/share/ceph/mgr/cephadm/module.py", line 2574, in _apply
    return self._add_host(cast(HostSpec, spec))
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1517, in _add_host
    ip_addr = self._check_valid_addr(spec.hostname, spec.addr)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1498, in _check_valid_addr
    error_ok=True, no_fsid=True)
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 1326, in _run_cephadm
    with self._remote_connection(host, addr) as tpl:
  File "/lib64/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 1558, in _remote_connection
    raise OrchestratorError(msg) from e
orchestrator._interface.OrchestratorError: Can't communicate with remote host `172.31.102.41`, possibly because python3 is not installed there: [Errno 12] Cannot allocate memory
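
A note on reading the trace above: the original OSError is raised inside `subprocess.Popen` (`_execute_child`) in the mgr's own Python process, while execnet is starting a local child process, so the `[Errno 12]` appears to come from the mgr side rather than from 172.31.102.41. That by itself does not say why the allocation failed. A minimal sketch of how a local spawn failure can be told apart from a successful start follows; `try_spawn` and its return labels are hypothetical names for illustration and are not cephadm or execnet code.

```python
import errno
import subprocess


def try_spawn(args):
    """Start a local child process and classify the outcome.

    Returns "spawned" if the child started, "local-enomem" if the local
    fork/exec failed with ENOMEM (errno 12), or "local-oserror" for any
    other local OSError (for example, a missing executable).
    """
    try:
        proc = subprocess.Popen(
            args, stdin=subprocess.PIPE, stdout=subprocess.PIPE
        )
    except OSError as exc:
        # The target program never started, so nothing remote was contacted.
        if exc.errno == errno.ENOMEM:
            return "local-enomem"
        return "local-oserror"
    proc.communicate()  # close the pipes and reap the child
    return "spawned"
```

In this sketch a "local-enomem" result would correspond to the first traceback above, where the failure happens before any connection to the new host is attempted.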




--
Gary Molenkamp            Science Technology Services
Systems Engineer        University of Western Ontario
molenkam@xxxxxx                 http://sts.sci.uwo.ca
(519) 661-2111 x86882        (519) 661-3566
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx





