Re: Problems adding a new host via orchestration. (solved)

Oh really? That seems rather trivial but alright. :-D

Quoting Gary Molenkamp <molenkam@xxxxxx>:

Just wanted to follow up on this to say that it is now working.

After reviewing the configuration of the new host many times, I did a hard restart of the active mgr container.
The command to add the new host proceeded without error.

Thanks everyone.
Gary



On 2024-02-06 16:01, Tim Holloway wrote:

Just FYI, I've seen this on CentOS systems as well, and I'm not even
sure that it was just for Ceph. Maybe some stuff like Ansible.

I THINK you can safely ignore that message or alternatively that it's
such an easy fix that senility has already driven it from my mind.

    Tim

On Tue, 2024-02-06 at 14:44 -0500, Gary Molenkamp wrote:
I confirmed selinux is disabled on all existing and new hosts.
Likewise, python3 is installed on all as well (3.9.16 on RL8, 3.9.18
on RL9).

I am running 16.2.12 on all containers, so it may be worth updating to
16.2.14 to ensure I'm on the latest Pacific release.

Gary


On 2024-02-05 08:17, Curt wrote:


I don't use rocky, so this is a stab in the dark and probably not the
issue, but could selinux be blocking the process? Really long shot,
but is python3 in the standard location? And if you run python3
--version as your ceph user, what does it return?

Probably not much help, but figured I'd throw it out there.
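
For the record, those two checks can be scripted; a minimal sketch,
to be run as the ceph user on the new host:

```shell
#!/bin/sh
# Confirm python3 is on the PATH, show where it lives, and report
# its version -- the checks suggested above.
if command -v python3 >/dev/null 2>&1; then
    command -v python3     # where the interpreter lives
    python3 --version      # what version it reports
else
    echo "python3 not found in PATH" >&2
    exit 1
fi
```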

On Mon, 5 Feb 2024, 16:54 Gary Molenkamp, <molenkam@xxxxxx> wrote:

    I have verified that the server's expected hostname (with
    `hostname`) matches the hostname I am trying to use.
    Just to be sure, I also ran:
         cephadm check-host --expect-hostname <hostname>
    and it returns:
         Hostname "<hostname>" matches what is expected.

    On the current admin server where I am trying to add the host, the
    host is reachable, and the shortname matches the proper IP with the
    DNS search order. Likewise, on the server where the mgr is running,
    I am able to confirm reachability and DNS resolution for the new
    server as well.

    I thought this may be a DNS/name resolution issue as well, but I
    don't see any errors in my setup with respect to host naming.

    Thanks
    Gary
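
The manual forward-lookup checks described above can be sketched as a
small helper; the function name and the example values are
placeholders, not taken from the cluster, and it would be run on both
the admin node and the node hosting the active mgr:

```python
import socket

def dns_sanity_check(hostname, expected_ip):
    """Return True if `hostname` resolves to `expected_ip` from this node.

    A hypothetical stand-in for the manual DNS checks described above.
    """
    try:
        resolved = socket.gethostbyname(hostname)  # forward lookup (IPv4)
    except socket.gaierror:
        return False  # name does not resolve at all from this node
    return resolved == expected_ip

# Example with placeholder values:
# dns_sanity_check("newhost", "172.31.102.41")
```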


    On 2024-02-03 06:46, Eugen Block wrote:
    > Hi,
    >
    > I found this blog post [1] which reports the same error message.
    > It seems a bit misleading because it appears to be about DNS. Can
    > you check
    >
    > cephadm check-host --expect-hostname <HOSTNAME>
    >
    > Or is that what you already tried? It's not entirely clear how
    > you checked the hostname.
    >
    > Regards,
    > Eugen
    >
    > [1] https://blog.mousetech.com/ceph-distributed-file-system-for-the-enterprise/ceph-bogus-error-cannot-allocate-memory/
    >
    > Quoting Gary Molenkamp <molenkam@xxxxxx>:
    >
    >> Happy Friday all.  I was hoping someone could point me in the
    >> right direction or clarify any limitations that could be
    >> impacting an issue I am having.
    >>
    >> I'm struggling to add a new set of hosts to my ceph cluster using
    >> cephadm and orchestration.  When trying to add a host:
    >>     "ceph orch host add <hostname> 172.31.102.41 --labels _admin"
    >> returns:
    >>     "Error EINVAL: Can't communicate with remote host
    >> `172.31.102.41`, possibly because python3 is not installed there:
    >> [Errno 12] Cannot allocate memory"
    >>
    >> I've verified that the ceph ssh key works to the remote host, the
    >> host's name matches that returned from `hostname`, python3 is
    >> installed, and "/usr/sbin/cephadm prepare-host" on the new hosts
    >> returns "host is ok".  In addition, the cluster ssh key works
    >> between hosts and the existing hosts are able to ssh in using the
    >> ceph key.
    >>
    >> The existing ceph cluster is a Pacific release using docker-based
    >> containerization on a RockyLinux8 base OS.  The new hosts are
    >> RockyLinux9 based, with cephadm being installed from the Quincy
    >> release:
    >>         ./cephadm add-repo --release quincy
    >>         ./cephadm install
    >> I did try installing cephadm from the Pacific release by changing
    >> the repo to el8, but that did not work either.
    >>
    >> Is there a limitation in mixing RL8 and RL9 container hosts under
    >> Pacific?  Does this same limitation exist under Quincy?  Is there
    >> a python version dependency?
    >> The reason for RL9 on the new hosts is to stage upgrading the
    >> OS's for the cluster.  I did this under Octopus for moving from
    >> Centos7 to RL8.
    >>
    >> Thanks and I appreciate any feedback/pointers.
    >> Gary
    >>
    >> I've added the log trace here in case that helps (from `ceph log
    >> last cephadm`):
    >>
    >> 2024-02-02T14:22:32.610048+0000 mgr.storage01.oonvfl (mgr.441023307)
    >> 4957871 : cephadm [ERR] Can't communicate with remote host
    >> `172.31.102.41`, possibly because python3 is not installed there:
    >> [Errno 12] Cannot allocate memory
    >> Traceback (most recent call last):
    >>   File "/usr/share/ceph/mgr/cephadm/serve.py", line 1524, in _remote_connection
    >>     conn, connr = self.mgr._get_connection(addr)
    >>   File "/usr/share/ceph/mgr/cephadm/module.py", line 1370, in _get_connection
    >>     sudo=True if self.ssh_user != 'root' else False)
    >>   File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 35, in __init__
    >>     self.gateway = self._make_gateway(hostname)
    >>   File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 46, in _make_gateway
    >>     self._make_connection_string(hostname)
    >>   File "/lib/python3.6/site-packages/execnet/multi.py", line 133, in makegateway
    >>     io = gateway_io.create_io(spec, execmodel=self.execmodel)
    >>   File "/lib/python3.6/site-packages/execnet/gateway_io.py", line 121, in create_io
    >>     io = Popen2IOMaster(args, execmodel)
    >>   File "/lib/python3.6/site-packages/execnet/gateway_io.py", line 21, in __init__
    >>     self.popen = p = execmodel.PopenPiped(args)
    >>   File "/lib/python3.6/site-packages/execnet/gateway_base.py", line 184, in PopenPiped
    >>     return self.subprocess.Popen(args, stdout=PIPE, stdin=PIPE)
    >>   File "/lib64/python3.6/subprocess.py", line 729, in __init__
    >>     restore_signals, start_new_session)
    >>   File "/lib64/python3.6/subprocess.py", line 1295, in _execute_child
    >>     restore_signals, start_new_session, preexec_fn)
    >> OSError: [Errno 12] Cannot allocate memory
    >>
    >> During handling of the above exception, another exception occurred:
    >>
    >> Traceback (most recent call last):
    >>   File "/usr/share/ceph/mgr/cephadm/serve.py", line 1528, in _remote_connection
    >>     raise execnet.gateway_bootstrap.HostNotFound(msg)
    >> execnet.gateway_bootstrap.HostNotFound: Can't communicate with
    >> remote host `172.31.102.41`, possibly because python3 is not
    >> installed there: [Errno 12] Cannot allocate memory
    >>
    >> The above exception was the direct cause of the following exception:
    >>
    >> Traceback (most recent call last):
    >>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 125, in wrapper
    >>     return OrchResult(f(*args, **kwargs))
    >>   File "/usr/share/ceph/mgr/cephadm/module.py", line 2709, in apply
    >>     results.append(self._apply(spec))
    >>   File "/usr/share/ceph/mgr/cephadm/module.py", line 2574, in _apply
    >>     return self._add_host(cast(HostSpec, spec))
    >>   File "/usr/share/ceph/mgr/cephadm/module.py", line 1517, in _add_host
    >>     ip_addr = self._check_valid_addr(spec.hostname, spec.addr)
    >>   File "/usr/share/ceph/mgr/cephadm/module.py", line 1498, in _check_valid_addr
    >>     error_ok=True, no_fsid=True)
    >>   File "/usr/share/ceph/mgr/cephadm/serve.py", line 1326, in _run_cephadm
    >>     with self._remote_connection(host, addr) as tpl:
    >>   File "/lib64/python3.6/contextlib.py", line 81, in __enter__
    >>     return next(self.gen)
    >>   File "/usr/share/ceph/mgr/cephadm/serve.py", line 1558, in _remote_connection
    >>     raise OrchestratorError(msg) from e
    >> orchestrator._interface.OrchestratorError: Can't communicate with
    >> remote host `172.31.102.41`, possibly because python3 is not
    >> installed there: [Errno 12] Cannot allocate memory
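
Worth noting about the trace above: the OSError is raised by
subprocess.Popen inside the mgr itself, i.e. the local fork() failed
before the remote host was ever contacted, so the "python3 is not
installed there" wording is misleading (and consistent with a hard
restart of the mgr container clearing the problem). A minimal sketch
of the pattern, with a hypothetical wrapper name:

```python
import errno
import subprocess

def open_remote_gateway(cmd):
    """Spawn a transport subprocess the way execnet does (hypothetical wrapper).

    An OSError with errno ENOMEM here means the *local* fork failed:
    the remote host's python3 was never involved.
    """
    try:
        return subprocess.Popen(cmd, stdout=subprocess.PIPE, stdin=subprocess.PIPE)
    except OSError as e:
        if e.errno == errno.ENOMEM:  # [Errno 12] Cannot allocate memory
            raise RuntimeError("local fork failed; check mgr container memory") from e
        raise
```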
    >>
    >>
    >>
    >>
    >> --
    >> Gary Molenkamp            Science Technology Services
    >> Systems Engineer        University of Western Ontario
    >> molenkam@xxxxxx http://sts.sci.uwo.ca
    >> (519) 661-2111 x86882        (519) 661-3566
    >> _______________________________________________
    >> ceph-users mailing list -- ceph-users@xxxxxxx
    >> To unsubscribe send an email to ceph-users-leave@xxxxxxx
    >
    >







