Re: PGs unknown (osd down) after conversion to cephadm


 



Hi Sebastian,

 

As I said, the orchestrator does not seem to be reachable after the cluster reboots. The requested output could only be gathered after a manual restart of the OSD containers. By the way, if I try to upgrade to v15.2.1 via cephadm (ceph orch upgrade start --version 15.2.1), the only output I get is “ceph version 15.2.0 (dc6a0b5c3cbf6a5e1d6d4f20b5ad466d76b96247) octopus (rc)”, and the upgrade does not start:

sudo ceph orch upgrade status

{
    "target_image": null,
    "in_progress": false,
    "services_complete": [],
    "message": ""
}
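
For anyone who wants to check this from a script, here is a minimal sketch that reads the status JSON shown above and reports whether an upgrade appears to have started. It assumes the output has exactly the fields in this message; `upgrade_started` is a hypothetical helper for illustration, not part of Ceph.

```python
import json

# Output of `ceph orch upgrade status`, copied from the message above.
STATUS_JSON = """
{
    "target_image": null,
    "in_progress": false,
    "services_complete": [],
    "message": ""
}
"""


def upgrade_started(status: dict) -> bool:
    """Return True if the status names a target image or reports work in progress.

    Hypothetical helper: it only inspects the two fields seen in this thread.
    """
    return bool(status.get("target_image")) or bool(status.get("in_progress"))


status = json.loads(STATUS_JSON)
print("upgrade started:", upgrade_started(status))
```

With the output above, neither field is set, which matches the observation that the upgrade never begins.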

 

Maybe it’s time to open a ticket.

 

Here are the requested outputs.

 

sudo ceph orch host ls --format json

 

[{"addr": "ceph1.domainname.de", "hostname": "ceph1.domainname.de", "labels": [], "status": ""}, {"addr": "ceph2.domainname.de", "hostname": "ceph2.domainname.de", "labels": [], "status": ""}, {"addr": "ceph3.domainname.de", "hostname": "ceph3.domainname.de", "labels": [], "status": ""}]
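
The hosts are registered here under their fully qualified names, while the error quoted further down in this thread complains that a reported hostname does not match the expected short name. The sketch below shows one way to tell a short-name/FQDN mismatch apart from a genuinely different host. `short_name` and `classify` are hypothetical helpers, and this is not the check cephadm itself performs; the example names are the redacted ones from the log.

```python
import socket


def short_name(name: str) -> str:
    """Return the part of a hostname before the first dot."""
    return name.split(".", 1)[0]


def classify(expected: str, reported: str) -> str:
    """Compare the name a host was registered under with the name it reports.

    Hypothetical helper for illustration only.
    """
    if expected == reported:
        return "match"
    if short_name(expected) == short_name(reported):
        return "short-vs-fqdn"
    return "different-host"


# Names taken from the quoted log: host added as "ceph1", but it reports its FQDN.
print(classify("ceph1", "ceph1.domain.de"))
# After re-adding the host under its FQDN, the two names agree.
print(classify("ceph1.domain.de", "ceph1.domain.de"))
# What the local machine reports, for comparison with `ceph orch host ls`.
print("this machine reports:", socket.gethostname())
```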

 

sudo ceph orch ls --format json

 

[{"container_image_id": "204a01f9b0b6710dd0c0af7f37ce7139c47ff0f0105d778d7104c69282dfbbf1", "container_image_name": "docker.io/ceph/ceph:v15", "service_name": "mds.media", "size": 3, "running": 3, "spec": {"placement": {"count": 3}, "service_type": "mds", "service_id": "media"}, "last_refresh": "2020-04-15T15:26:53.664473", "created": "2020-03-30T23:51:32.239555"}, {"container_image_id": "204a01f9b0b6710dd0c0af7f37ce7139c47ff0f0105d778d7104c69282dfbbf1", "container_image_name": "docker.io/ceph/ceph:v15", "service_name": "mgr", "size": 0, "running": 3, "last_refresh": "2020-04-15T15:26:53.664098"}, {"container_image_id": "204a01f9b0b6710dd0c0af7f37ce7139c47ff0f0105d778d7104c69282dfbbf1", "container_image_name": "docker.io/ceph/ceph:v15", "service_name": "mon", "size": 0, "running": 3, "last_refresh": "2020-04-15T15:26:53.664270"}]
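
To make this output easier to read, here is a small sketch that lists the services whose `size` differs from `running` and checks whether any OSD service shows up at all. The JSON is trimmed to the three fields used; `mismatched` and `has_service_type` are hypothetical helpers, and what a mismatch actually means for cephadm is not established in this thread.

```python
import json

# `ceph orch ls --format json` from above, reduced to the fields used here.
SERVICES_JSON = """[
  {"service_name": "mds.media", "size": 3, "running": 3},
  {"service_name": "mgr", "size": 0, "running": 3},
  {"service_name": "mon", "size": 0, "running": 3}
]"""


def mismatched(services: list) -> list:
    """Names of services whose expected size differs from the running count."""
    return [s["service_name"] for s in services if s["size"] != s["running"]]


def has_service_type(services: list, service_type: str) -> bool:
    """True if a service is named `service_type` or `service_type.<id>`."""
    return any(
        s["service_name"] == service_type
        or s["service_name"].startswith(service_type + ".")
        for s in services
    )


services = json.loads(SERVICES_JSON)
print("size != running:", mismatched(services))
print("osd service listed:", has_service_type(services, "osd"))
```

In this output only the mds service carries a placement spec, and no OSD service is listed.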

 

Thanks,

 

Marco

 

 

From: Sebastian Wagner
Sent: Tuesday, 14 April 2020 16:53
To: ceph-users@xxxxxxx
Subject: Re: PGs unknown (osd down) after conversion to cephadm

 

Might be an issue with cephadm.

 

Do you have the output of `ceph orch host ls --format json` and `ceph orch ls --format json`?

 

On 09.04.20 at 13:23, Dr. Marco Savoca wrote:

> Hi all,

>

>  

>

> last week I successfully upgraded my cluster to Octopus and converted it

> to cephadm. The conversion process (according to the docs) went well and

> the cluster ran in an active+clean status.

>

>  

>

> But after a reboot, all OSDs went down a couple of minutes later and
> all (100%) of the PGs went into the unknown state. The orchestrator
> isn’t reachable in this state (ceph orch status never returns).

>

>  

>

> A manual restart of the OSD daemons resolved the problem and the cluster
> is now active+clean again.

>

>  

>

> This behavior is reproducible.

>

>  

>

>  

>

> The “ceph log last cephadm” command spits out (redacted):

>

>  

>

>  

>

> 2020-03-30T23:07:06.881061+0000 mgr.ceph2 (mgr.1854484) 42 : cephadm [INF] Generating ssh key...
> 2020-03-30T23:22:00.250422+0000 mgr.ceph2 (mgr.1854484) 492 : cephadm [ERR] _Promise failed
> Traceback (most recent call last):
>   File "/usr/share/ceph/mgr/cephadm/module.py", line 444, in do_work
>     res = self._on_complete_(*args, **kwargs)
>   File "/usr/share/ceph/mgr/cephadm/module.py", line 512, in <lambda>
>     return cls(_on_complete_=lambda x: f(*x), value=args, name=name, **c_kwargs)
>   File "/usr/share/ceph/mgr/cephadm/module.py", line 1648, in add_host
>     spec.hostname, spec.addr, err))
> orchestrator._interface.OrchestratorError: New host ceph1 (ceph1) failed check: ['INFO:cephadm:podman|docker (/usr/bin/docker) is present', 'INFO:cephadm:systemctl is present', 'INFO:cephadm:lvcreate is present', 'INFO:cephadm:Unit systemd-timesyncd.service is enabled and running', 'ERROR: hostname "ceph1.domain.de" does not match expected hostname "ceph1"']
> 2020-03-30T23:22:27.267344+0000 mgr.ceph2 (mgr.1854484) 508 : cephadm [INF] Added host ceph1.domain.de
> 2020-03-30T23:22:36.078462+0000 mgr.ceph2 (mgr.1854484) 515 : cephadm [INF] Added host ceph2.domain.de
> 2020-03-30T23:22:55.200280+0000 mgr.ceph2 (mgr.1854484) 527 : cephadm [INF] Added host ceph3.domain.de
> 2020-03-30T23:23:17.491596+0000 mgr.ceph2 (mgr.1854484) 540 : cephadm [ERR] _Promise failed
> Traceback (most recent call last):
>   File "/usr/share/ceph/mgr/cephadm/module.py", line 444, in do_work
>     res = self._on_complete_(*args, **kwargs)
>   File "/usr/share/ceph/mgr/cephadm/module.py", line 512, in <lambda>
>     return cls(_on_complete_=lambda x: f(*x), value=args, name=name, **c_kwargs)
>   File "/usr/share/ceph/mgr/cephadm/module.py", line 1648, in add_host
>     spec.hostname, spec.addr, err))
> orchestrator._interface.OrchestratorError: New host ceph1 (10.10.0.10) failed check: ['INFO:cephadm:podman|docker (/usr/bin/docker) is present', 'INFO:cephadm:systemctl is present', 'INFO:cephadm:lvcreate is present', 'INFO:cephadm:Unit systemd-timesyncd.service is enabled and running', 'ERROR: hostname "ceph1.domain.de" does not match expected hostname "ceph1"']

>

>  

>

> Could this be a problem with the ssh key?

>

>  

>

> Thanks for the help and happy Easter.

>

>  

>

> Marco Savoca

>

>  

>

>

> _______________________________________________

> ceph-users mailing list -- ceph-users@xxxxxxx

> To unsubscribe send an email to ceph-users-leave@xxxxxxx

>

 

--

SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany

(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer

 

 

