We found a fix for our issue with ceph orch reporting wrong/outdated service information:
https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/message/DAFXD46NALFAFUBQEYODRIFWSD6SH2OL/

In our case, the DNS settings were messed up on the cluster hosts AND also within the MGR daemon containers (cephadm deployed).
Not sure, but I could imagine this also interfering with proper host detection.

So I guess it's worth at least confirming the settings in /etc/resolv.conf on all your hosts and MGR containers.

Best,
Mathias

On 6/29/2022 5:59 PM, Mathias Kuhring wrote:
> Hey all,
>
> just want to note that I'm also looking for some kind of way to
> restart/reset/refresh orchestrator.
> But in my case it's not the hosts but the services which are
> presumably wrongly reported and outdated:
> https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/NHEVEM3ESJYXZ4LPJ24BBCK6NCG4QRHP/
>
> Don't know if this even can be related.
> But in case you find a solution, I'll just stick around here and check
> if I can apply it.
>
> Best,
> Mathias
>
> On 6/27/2022 12:33 PM, Thomas Roth wrote:
>> Hi Adam,
>>
>> no, this is the 'feature' where the reboot of a mgr host causes all
>> known hosts to become unmanaged.
>>
>> > # lxbk0375 # ceph cephadm check-host lxbk0374 10.20.2.161
>> > mgr.server reply reply (1) Operation not permitted check-host failed:
>> > Host 'lxbk0374' not found. Use 'ceph orch host ls' to see all managed hosts.
>>
>> In some email on this issue I can't find atm, someone describes a
>> workaround that allows restarting the entire orchestrator business.
>> But that sounded risky.
>>
>> Regards
>> Thomas
>>
>> On 23/06/2022 19.42, Adam King wrote:
>>> Hi Thomas,
>>>
>>> What happens if you run "ceph cephadm check-host <hostname>" for one of the
>>> hosts that is offline (and if that fails "ceph cephadm check-host
>>> <hostname> <ip-addr>")? Usually, the hosts get marked offline when some ssh
>>> connection to them fails.
>>> The check-host command will attempt a connection
>>> and maybe let us see why it's failing, or, if there is no longer an issue
>>> connecting to the host, should mark the host online again.
>>>
>>> Thanks,
>>> - Adam King
>>>
>>> On Thu, Jun 23, 2022 at 12:30 PM Thomas Roth <t.roth@xxxxxx> wrote:
>>>
>>>> Hi all,
>>>>
>>>> found this bug https://tracker.ceph.com/issues/51629 (Octopus 15.2.13),
>>>> reproduced it in Pacific and now again in Quincy:
>>>> - new cluster
>>>> - 3 mgr nodes
>>>> - reboot active mgr node
>>>> - (only in Quincy:) standby mgr node takes over, rebooted node becomes standby
>>>> - `ceph orch host ls` shows all hosts as `offline`
>>>> - add a new host: not offline
>>>>
>>>> In my setup, hostnames and IPs are well known, thus
>>>>
>>>> # ceph orch host ls
>>>> HOST      ADDR         LABELS  STATUS
>>>> lxbk0374  10.20.2.161  _admin  Offline
>>>> lxbk0375  10.20.2.162          Offline
>>>> lxbk0376  10.20.2.163          Offline
>>>> lxbk0377  10.20.2.164          Offline
>>>> lxbk0378  10.20.2.165          Offline
>>>> lxfs416   10.20.2.178          Offline
>>>> lxfs417   10.20.2.179          Offline
>>>> lxfs418   10.20.2.222          Offline
>>>> lxmds22   10.20.6.67
>>>> lxmds23   10.20.6.72           Offline
>>>> lxmds24   10.20.6.74           Offline
>>>>
>>>> (All lxbk are mon nodes, the first 3 are mgr, 'lxmds22' was added after
>>>> the fatal reboot.)
>>>>
>>>> Does this matter at all?
>>>> The old bug report is one year old, now with prio 'Low'. And some people
>>>> must have rebooted one or another host in their clusters...
>>>>
>>>> There is a cephfs on our cluster, operations seem to be unaffected.
>>>>
>>>> Cheers
>>>> Thomas
>>>>
>>>> --
>>>> --------------------------------------------------------------------
>>>> Thomas Roth
>>>> Department: Informationstechnologie
>>>> Location: SB3 2.291
>>>>
>>>> GSI Helmholtzzentrum für Schwerionenforschung GmbH
>>>> Planckstraße 1, 64291 Darmstadt, Germany, www.gsi.de
>>>>
>>>> Commercial Register / Handelsregister: Amtsgericht Darmstadt, HRB 1528
>>>> Managing Directors / Geschäftsführung:
>>>> Professor Dr. Paolo Giubellino, Dr. Ulrich Breuer, Jörg Blaurock
>>>> Chairman of the Supervisory Board / Vorsitzender des GSI-Aufsichtsrats:
>>>> State Secretary / Staatssekretär Dr. Volkmar Dietz
>>>>
>>>> _______________________________________________
>>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx

--
Mathias Kuhring

Dr. rer. nat.
Bioinformatician
HPC & Core Unit Bioinformatics
Berlin Institute of Health at Charité (BIH)

E-Mail: mathias.kuhring@xxxxxxxxxxxxxx
Mobile: +49 172 3475576
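
P.S. For anyone who wants to do the /etc/resolv.conf comparison mentioned at the top of this message, here is a rough sketch of how it might be scripted. The two helper functions (`nameservers`, `same_nameservers`) are made up for illustration and are not part of ceph or cephadm. They only compare the `nameserver` entries of two resolv.conf-style files and ignore their order, which may or may not be enough for your setup.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helpers for comparing DNS resolver settings.
# Neither function is provided by ceph or cephadm.

# Print the nameserver addresses found in a resolv.conf-style file,
# one per line, sorted and de-duplicated (so ordering is ignored).
nameservers() {
    awk '$1 == "nameserver" { print $2 }' "$1" | sort -u
}

# Exit status 0 if both files list the same set of nameservers,
# non-zero otherwise.
same_nameservers() {
    [ "$(nameservers "$1")" = "$(nameservers "$2")" ]
}
```

One way to use it, assuming you can get a shell inside the MGR container with `cephadm enter --name <mgr daemon name>` (daemon names as listed by `ceph orch ps`): save the container's /etc/resolv.conf to a file on the host, then run `same_nameservers /etc/resolv.conf <saved copy>` and check the exit status. Repeat on each host that runs an MGR.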