Re: Unsetting maintenance mode for failed host

Hi,

If you just want the cluster to drain this host while you work on bringing it back online, I would simply remove the noout flag:

ceph osd rm-noout osd1

This flag is set when the host enters maintenance mode (ceph osd add-noout <HOST>). Removing it will not clear the health warning (host is in maintenance), though; that warning stays until the host is back. One (potentially dangerous) way to clear it is to remove the host from the host list (ceph orch host rm <host>), but this can cause data loss because the entire host is removed from the crushmap. So I would only choose that path after all PGs have been backfilled successfully and if bringing the host back takes longer than expected.
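
As a rough sketch of the first option, the commands could be wrapped like this. The function names (unset_noout, show_progress) and the CEPH variable are made up for illustration and are not part of Ceph; only the ceph subcommands themselves (osd rm-noout, pg stat, health detail) are real. Nothing here touches the host list.

```shell
#!/bin/sh
# Sketch only. CEPH is a hypothetical override so the commands can be
# dry-run (for example CEPH=echo); by default the real ceph CLI is used.
CEPH="${CEPH:-ceph}"

# Remove the noout flag that maintenance mode set for the given host,
# so its OSDs can be marked out and the data backfilled elsewhere.
unset_noout() {
    "$CEPH" osd rm-noout "$1"
}

# Print the PG summary and the detailed health report. The maintenance
# warning is expected to remain until the host is back.
show_progress() {
    "$CEPH" pg stat
    "$CEPH" health detail
}
```

For example, run unset_noout osd1 once, then show_progress every few minutes until the PGs have finished backfilling.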

Maybe there are more options which I haven't thought of, but these two came to mind.

Regards,
Eugen

Quoting Bryce Nicholls <Bryce.Nicholls92@xxxxxxxxxxxxxxx>:

Hi

We put a host in maintenance and had issues bringing it back.
Is there a safe way of exiting maintenance while the host is unreachable / offline? We would like the cluster to rebalance while we are working to get this host back online.

Maintenance was set using:
ceph orch host maintenance enter osd1

I tried exiting using:
ceph orch host maintenance exit osd1

but got the stack trace below.

root@mon1 ~ # ceph orch host maintenance exit osd1

Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1756, in _handle_command
    return self.handle_command(inbuf, cmd)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 171, in handle_command
    return dispatch[cmd['prefix']].call(self, cmd, inbuf)
  File "/usr/share/ceph/mgr/mgr_module.py", line 462, in call
    return self.func(mgr, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 107, in <lambda>
    wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)  # noqa: E731
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 96, in wrapper
    return func(*args, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/module.py", line 455, in _host_maintenance_exit
    raise_if_exception(completion)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 225, in raise_if_exception
    e = pickle.loads(c.serialized_exception)
TypeError: __init__() missing 2 required positional arguments: 'hostname' and 'addr'


Thanks
Bryce


Bryce Nicholls
OpenStack Engineer
Bryce.Nicholls92@xxxxxxxxxxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx

