Might be an issue with cephadm.

Do you have the output of `ceph orch host ls --format json` and `ceph orch ls --format json`?

On 09.04.20 at 13:23, Dr. Marco Savoca wrote:
> Hi all,
>
> last week I successfully upgraded my cluster to Octopus and converted it
> to cephadm. The conversion process (according to the docs) went well and
> the cluster ran in an active+clean status.
>
> But after a reboot all OSDs went down, with a delay of a couple of minutes
> after the reboot, and all (100%) of the PGs ran into the unknown state. The
> orchestrator isn’t reachable during this state (ceph orch status
> never returns).
>
> A manual restart of the OSD daemons resolved the problem and the cluster
> is now active+clean again.
>
> This behavior is reproducible.
>
> The “ceph log last cephadm” command spits out (redacted):
>
> 2020-03-30T23:07:06.881061+0000 mgr.ceph2 (mgr.1854484) 42 : cephadm [INF] Generating ssh key...
> 2020-03-30T23:22:00.250422+0000 mgr.ceph2 (mgr.1854484) 492 : cephadm [ERR] _Promise failed
> Traceback (most recent call last):
>   File "/usr/share/ceph/mgr/cephadm/module.py", line 444, in do_work
>     res = self._on_complete_(*args, **kwargs)
>   File "/usr/share/ceph/mgr/cephadm/module.py", line 512, in <lambda>
>     return cls(on_complete=lambda x: f(*x), value=args, name=name, **c_kwargs)
>   File "/usr/share/ceph/mgr/cephadm/module.py", line 1648, in add_host
>     spec.hostname, spec.addr, err))
> orchestrator._interface.OrchestratorError: New host ceph1 (ceph1) failed check: ['INFO:cephadm:podman|docker (/usr/bin/docker) is present', 'INFO:cephadm:systemctl is present', 'INFO:cephadm:lvcreate is present', 'INFO:cephadm:Unit systemd-timesyncd.service is enabled and running', 'ERROR: hostname "ceph1.domain.de" does not match expected hostname "ceph1"']
> 2020-03-30T23:22:27.267344+0000 mgr.ceph2 (mgr.1854484) 508 : cephadm [INF] Added host ceph1.domain.de
> 2020-03-30T23:22:36.078462+0000 mgr.ceph2 (mgr.1854484) 515 : cephadm [INF] Added host ceph2.domain.de
> 2020-03-30T23:22:55.200280+0000 mgr.ceph2 (mgr.1854484) 527 : cephadm [INF] Added host ceph3.domain.de
> 2020-03-30T23:23:17.491596+0000 mgr.ceph2 (mgr.1854484) 540 : cephadm [ERR] _Promise failed
> Traceback (most recent call last):
>   File "/usr/share/ceph/mgr/cephadm/module.py", line 444, in do_work
>     res = self._on_complete_(*args, **kwargs)
>   File "/usr/share/ceph/mgr/cephadm/module.py", line 512, in <lambda>
>     return cls(on_complete=lambda x: f(*x), value=args, name=name, **c_kwargs)
>   File "/usr/share/ceph/mgr/cephadm/module.py", line 1648, in add_host
>     spec.hostname, spec.addr, err))
> orchestrator._interface.OrchestratorError: New host ceph1 (10.10.0.10) failed check: ['INFO:cephadm:podman|docker (/usr/bin/docker) is present', 'INFO:cephadm:systemctl is present', 'INFO:cephadm:lvcreate is present', 'INFO:cephadm:Unit systemd-timesyncd.service is enabled and running', 'ERROR: hostname "ceph1.domain.de" does not match expected hostname "ceph1"']
>
> Could this be a problem with the ssh key?
>
> Thanks for the help and happy Easter.
>
> Marco Savoca
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx

--
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Managing Director: Felix Imendörffer
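
Both tracebacks in the quoted log end in the same check failure: the host reports itself as "ceph1.domain.de" while the orchestrator was given "ceph1". As a rough sketch only, the two names can be pulled out of such a log excerpt to confirm that the mismatch is FQDN versus short name. The helper names and the regular expression below are illustrative assumptions, not part of cephadm.

```python
import re

# Matches the error text as it appears in the quoted log. This is an
# illustrative pattern written for this thread, not cephadm's own code.
MISMATCH_RE = re.compile(
    r'hostname "(?P<actual>[^"]+)" does not match expected hostname '
    r'"(?P<expected>[^"]+)"'
)


def find_hostname_mismatch(log_text):
    """Return (actual, expected) from the first mismatch error, or None."""
    # Quoted mail wraps long lines with "> " markers; drop the markers and
    # collapse whitespace so a wrapped error still matches.
    flat = " ".join(log_text.replace(">", " ").split())
    match = MISMATCH_RE.search(flat)
    if match is None:
        return None
    return match.group("actual"), match.group("expected")


def is_short_name_of(expected, actual):
    """True if `expected` is the first DNS label of the longer `actual`."""
    return actual != expected and actual.split(".", 1)[0] == expected


# Error fragment copied from the quoted log, including a mail line wrap.
sample = (
    'ERROR: hostname "ceph1.domain.de" does not match expected > '
    'hostname "ceph1"'
)

result = find_hostname_mismatch(sample)
if result is not None:
    actual, expected = result
    print(actual, expected, is_short_name_of(expected, actual))
```

In the quoted log the hosts were subsequently added under their full names (ceph1.domain.de and so on), which is the form the hosts report about themselves.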