Hi Sebastian,

of course! I misspelled the option. Sometimes it’s difficult to see the forest for the trees… But after the upgrade to 15.2.1 I now have the CEPHADM_STRAY_HOST problem:

HEALTH_WARN 3 stray host(s) with 15 daemon(s) not managed by cephadm
[WRN] CEPHADM_STRAY_HOST: 3 stray host(s) with 15 daemon(s) not managed by cephadm
    stray host ceph1 has 5 stray daemons: ['mds.media.ceph1.xzotzy', 'mgr.ceph1', 'mon.ceph1', 'osd.0', 'osd.1']
    stray host ceph2 has 5 stray daemons: ['mds.media.ceph2.bitmic', 'mgr.ceph2', 'mon.ceph2', 'osd.2', 'osd.3']
    stray host ceph3 has 5 stray daemons: ['mds.media.ceph3.rlxujb', 'mgr.ceph3', 'mon.ceph3', 'osd.4', 'osd.5']

Maybe this is related to the hostname vs. FQDN mismatch issue? My mon metadata (for one node):

    "name": "ceph1",
    "addrs": "[v2:10.10.0.10:3300/0,v1:10.10.0.10:6789/0]",
    "arch": "x86_64",
    "ceph_release": "octopus",
    "ceph_version": "ceph version 15.2.1 (9fd2f65f91d9246fae2c841a6222d34d121680ee) octopus (stable)",
    "ceph_version_short": "15.2.1",
    "compression_algorithms": "none, snappy, zlib, zstd, lz4",
    "container_hostname": "ceph1.domainname.de",
    "container_image": "ceph/ceph:v15.2.1",
    "cpu": "Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz",
    "device_ids": "sda=INTEL_SSDSC2KB480G8_PHYF924001VB480BGN",
    "device_paths": "sda=/dev/disk/by-path/pci-0000:00:1f.2-ata-1",
    "devices": "sda",
    "distro": "centos",
    "distro_description": "CentOS Linux 8 (Core)",
    "distro_version": "8",
    "hostname": "ceph1",
    "kernel_description": "#201910180137 SMP Fri Oct 18 01:40:58 UTC 2019",
    "kernel_version": "4.19.80-041980-generic",
    "mem_swap_kb": "0",
    "mem_total_kb": "65936872",
    "os": "Linux"

and again the output of ceph orch host ls:

    HOST                 ADDR                 LABELS  STATUS
    ceph1.domainname.de  ceph1.domainname.de
    ceph2.domainname.de  ceph2.domainname.de
    ceph3.domainname.de  ceph3.domainname.de

Thx,

Marco
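PS: In case it really is the short-name vs. FQDN mismatch, my working assumption (untested, so please correct me) is that cephadm matches the stray daemons by the short hostname they report in their metadata ("hostname": "ceph1"), while my hosts are registered under their FQDNs. A rough sketch of what I would try; only 10.10.0.10 is confirmed above, the addresses for ceph2 and ceph3 below are just placeholders from my subnet:

    # re-register the hosts under the short names the daemons report
    sudo ceph orch host add ceph1 10.10.0.10
    sudo ceph orch host add ceph2 10.10.0.11   # placeholder address
    sudo ceph orch host add ceph3 10.10.0.12   # placeholder address
    # then drop the FQDN entries so each host is listed only once
    sudo ceph orch host rm ceph1.domainname.de
    sudo ceph orch host rm ceph2.domainname.de
    sudo ceph orch host rm ceph3.domainname.de
    sudo ceph orch host ls

If the warning is only cosmetic, silencing it might also be an option (assuming the mgr/cephadm/warn_on_stray_hosts setting is already there in 15.2.1):

    sudo ceph config set mgr mgr/cephadm/warn_on_stray_hosts false

But I would rather fix the naming than hide the warning. Does that sound right to you?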
From: Sebastian Wagner

Hi Marco,

# ceph orch upgrade start --ceph-version 15.2.1

should do the trick.

On 15.04.20 at 17:40, Dr. Marco Savoca wrote:
> Hi Sebastian,
>
> as I said, the orchestrator does not seem to be reachable after the
> cluster’s reboot. The requested output could only be gathered after a
> manual restart of the OSD containers. By the way, if I try to upgrade to
> v15.2.1 via cephadm (ceph orch upgrade start --version 15.2.1), I only
> get the output “ceph version 15.2.0
> (dc6a0b5c3cbf6a5e1d6d4f20b5ad466d76b96247) octopus (rc)” and the upgrade
> does not start:
>
> sudo ceph orch upgrade status
>
> {
>     "target_image": null,
>     "in_progress": false,
>     "services_complete": [],
>     "message": ""
> }
>
> Maybe it’s time to open a ticket.
>
> Here are the requested outputs.
>
> sudo ceph orch host ls --format json
>
> [{"addr": "ceph1.domainname.de", "hostname": "ceph1.domainname.de", "labels": [], "status": ""},
>  {"addr": "ceph2.domainname.de", "hostname": "ceph2.domainname.de", "labels": [], "status": ""},
>  {"addr": "ceph3.domainname.de", "hostname": "ceph3.domainname.de", "labels": [], "status": ""}]
>
> sudo ceph orch ls --format json
>
> [{"container_image_id": "204a01f9b0b6710dd0c0af7f37ce7139c47ff0f0105d778d7104c69282dfbbf1",
>   "container_image_name": "docker.io/ceph/ceph:v15", "service_name": "mds.media",
>   "size": 3, "running": 3,
>   "spec": {"placement": {"count": 3}, "service_type": "mds", "service_id": "media"},
>   "last_refresh": "2020-04-15T15:26:53.664473", "created": "2020-03-30T23:51:32.239555"},
>  {"container_image_id": "204a01f9b0b6710dd0c0af7f37ce7139c47ff0f0105d778d7104c69282dfbbf1",
>   "container_image_name": "docker.io/ceph/ceph:v15", "service_name": "mgr",
>   "size": 0, "running": 3, "last_refresh": "2020-04-15T15:26:53.664098"},
>  {"container_image_id": "204a01f9b0b6710dd0c0af7f37ce7139c47ff0f0105d778d7104c69282dfbbf1",
>   "container_image_name": "docker.io/ceph/ceph:v15", "service_name": "mon",
>   "size": 0, "running": 3, "last_refresh": "2020-04-15T15:26:53.664270"}]
>
> Thanks,
>
> Marco
>
> From: Sebastian Wagner <mailto:swagner@xxxxxxxx>
> Sent: Tuesday, 14 April 2020 16:53
> To: ceph-users@xxxxxxx <mailto:ceph-users@xxxxxxx>
> Subject: Re: PGs unknown (osd down) after conversion to cephadm
>
> Might be an issue with cephadm.
>
> Do you have the output of `ceph orch host ls --format json` and
> `ceph orch ls --format json`?
>
> On 09.04.20 at 13:23, Dr. Marco Savoca wrote:
>> Hi all,
>>
>> last week I successfully upgraded my cluster to Octopus and converted it
>> to cephadm. The conversion process (according to the docs) went well and
>> the cluster ran in an active+clean state.
>>
>> But after a reboot all OSDs went down with a delay of a couple of minutes,
>> and all (100%) of the PGs went into the unknown state. The orchestrator
>> isn’t reachable in this state (ceph orch status doesn’t come to an end).
>>
>> A manual restart of the OSD daemons resolved the problem and the cluster
>> is now active+clean again.
>>
>> This behavior is reproducible.
>>
>> The “ceph log last cephadm” command spits out (redacted):
>>
>> 2020-03-30T23:07:06.881061+0000 mgr.ceph2 (mgr.1854484) 42 : cephadm
>> [INF] Generating ssh key...
>>
>> 2020-03-30T23:22:00.250422+0000 mgr.ceph2 (mgr.1854484) 492 : cephadm
>> [ERR] _Promise failed
>> Traceback (most recent call last):
>>   File "/usr/share/ceph/mgr/cephadm/module.py", line 444, in do_work
>>     res = self._on_complete_(*args, **kwargs)
>>   File "/usr/share/ceph/mgr/cephadm/module.py", line 512, in <lambda>
>>     return cls(_on_complete_=lambda x: f(*x), value=args, name=name, **c_kwargs)
>>   File "/usr/share/ceph/mgr/cephadm/module.py", line 1648, in add_host
>>     spec.hostname, spec.addr, err))
>> orchestrator._interface.OrchestratorError: New host ceph1 (ceph1) failed
>> check: ['INFO:cephadm:podman|docker (/usr/bin/docker) is present',
>> 'INFO:cephadm:systemctl is present', 'INFO:cephadm:lvcreate is present',
>> 'INFO:cephadm:Unit systemd-timesyncd.service is enabled and running',
>> 'ERROR: hostname "ceph1.domain.de" does not match expected hostname "ceph1"']
>>
>> 2020-03-30T23:22:27.267344+0000 mgr.ceph2 (mgr.1854484) 508 : cephadm
>> [INF] Added host ceph1.domain.de
>>
>> 2020-03-30T23:22:36.078462+0000 mgr.ceph2 (mgr.1854484) 515 : cephadm
>> [INF] Added host ceph2.domain.de
>>
>> 2020-03-30T23:22:55.200280+0000 mgr.ceph2 (mgr.1854484) 527 : cephadm
>> [INF] Added host ceph3.domain.de
>>
>> 2020-03-30T23:23:17.491596+0000 mgr.ceph2 (mgr.1854484) 540 : cephadm
>> [ERR] _Promise failed
>> Traceback (most recent call last):
>>   File "/usr/share/ceph/mgr/cephadm/module.py", line 444, in do_work
>>     res = self._on_complete_(*args, **kwargs)
>>   File "/usr/share/ceph/mgr/cephadm/module.py", line 512, in <lambda>
>>     return cls(_on_complete_=lambda x: f(*x), value=args, name=name, **c_kwargs)
>>   File "/usr/share/ceph/mgr/cephadm/module.py", line 1648, in add_host
>>     spec.hostname, spec.addr, err))
>> orchestrator._interface.OrchestratorError: New host ceph1 (10.10.0.10) failed
>> check: ['INFO:cephadm:podman|docker (/usr/bin/docker) is present',
>> 'INFO:cephadm:systemctl is present', 'INFO:cephadm:lvcreate is present',
>> 'INFO:cephadm:Unit systemd-timesyncd.service is enabled and running',
>> 'ERROR: hostname "ceph1.domain.de" does not match expected hostname "ceph1"']
>>
>> Could this be a problem with the ssh key?
>>
>> Thanks for the help and happy Easter.
>>
>> Marco Savoca
>>
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@xxxxxxx
>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
> --
> SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer

--
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx