Hi,
there have been multiple reports on this list that the mgr daemon sometimes seems to stop working without any indication of a root cause. I have also experienced this quite a few times in my test clusters; failing the mgr seemed to help most of the time:
ceph mgr fail
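A minimal sketch of that, using only the standard ceph CLI (without a name, 'ceph mgr fail' fails whichever mgr is currently active):

# check which mgr is active and which are standby
ceph mgr stat
# fail the active mgr so a standby takes over
ceph mgr fail
# verify that a new mgr became active
ceph -s | grep mgr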
As for the OSDs, do you see attempts to build the OSD containers? The logs could be of help here.
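Something like this on one of the OSD hosts, for example (the unit name is the one from your 'cephadm ls' output further down; osd.0 is just an example):

# journal of the adopted, containerized OSD unit
journalctl -u ceph-4cfa6467-6647-41e9-8184-1cacc408265c@osd.0 --since "1 hour ago"
# or let cephadm resolve the unit name itself
cephadm logs --name osd.0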
Regards,
Eugen
Quoting Andre Goree <agoree@xxxxxxxxxxxxxxxxxx>:
Hello all. I'm upgrading a cluster from (Ubuntu 16.04) Luminous to
Pacific, within which I've upgraded to (18.04) Nautilus, then to
(20.04) Octopus. The cluster ran flawlessly throughout that upgrade
process, which I'm very happy about.
I'm now at the point of converting the cluster to cephadm (it was
built with ceph-deploy), but I'm running into trouble. I've
followed this doc: https://docs.ceph.com/en/latest/cephadm/adoption/
3 MON nodes
4 OSD nodes
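For reference, the adoption steps from that doc look roughly like this (hostnames and OSD ids are placeholders, not an exact transcript of what I ran):

# on each legacy host: see which daemons cephadm can find
cephadm ls
# adopt the monitors and managers, one daemon at a time
cephadm adopt --style legacy --name mon.<hostname>
cephadm adopt --style legacy --name mgr.<hostname>
# switch orchestration to cephadm and distribute its SSH key
ceph mgr module enable cephadm
ceph orch set backend cephadm
ceph cephadm generate-key
ceph cephadm get-pub-key > ~/ceph.pub
ssh-copy-id -f -i ~/ceph.pub root@<hostname>
# tell the orchestrator about each host, then adopt the OSDs
ceph orch host add <hostname>
cephadm adopt --style legacy --name osd.<id>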
The trouble is two-fold: (1) it seems that once I've adopted
the MON & MGR daemons, I can't get the MON on the local host to show
up in "ceph orch ps"; it only lists the two other MON nodes:
#### On MON node ####
root@cephmon01test:~# ceph orch ps
NAME               HOST           PORTS  STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
mgr.cephmon02test  cephmon02test         running (21h)  8m ago     21h     365M        -  16.2.5   6933c2a0b7dd  e08de388b92e
mgr.cephmon03test  cephmon03test         running (21h)  6m ago     21h     411M        -  16.2.5   6933c2a0b7dd  d358b697e49b
mon.cephmon02test  cephmon02test         running (21h)  8m ago       -     934M    2048M  16.2.5   6933c2a0b7dd  f349d7cc6816
mon.cephmon03test  cephmon03test         running (21h)  6m ago       -     923M    2048M  16.2.5   6933c2a0b7dd  64880b0659cc
root@cephmon01test:~# ceph orch ls
NAME PORTS RUNNING REFRESHED AGE PLACEMENT
mgr 2/0 8m ago - <unmanaged>
mon 2/0 8m ago - <unmanaged>
All of the 'cephadm adopt' commands for the MONs and MGRs were run
from the above node.
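(Side note: my understanding is that "ceph orch ps" only lists daemons on hosts the orchestrator already knows about, which should be visible with something like the following; cephmon01test here is just my local MON host:)

ceph orch host ls
# and, if the local host is missing from that list, presumably:
ceph orch host add cephmon01test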
My second issue is that when I proceed to adopt the OSDs (again,
following https://docs.ceph.com/en/latest/cephadm/adoption/), they
seem to drop out of the cluster:
### on OSD node ###
root@cephosd01test:~# cephadm ls
[
    {
        "style": "cephadm:v1",
        "name": "osd.0",
        "fsid": "4cfa6467-6647-41e9-8184-1cacc408265c",
        "systemd_unit": "ceph-4cfa6467-6647-41e9-8184-1cacc408265c@osd.0",
        "enabled": true,
        "state": "error",
        "container_id": null,
        "container_image_name": "ceph/ceph:v16",
        "container_image_id": null,
        "version": null,
        "started": null,
        "created": null,
        "deployed": "2021-12-11T00:19:24.799615Z",
        "configured": null
    },
    {
        "style": "cephadm:v1",
        "name": "osd.1",
        "fsid": "4cfa6467-6647-41e9-8184-1cacc408265c",
        "systemd_unit": "ceph-4cfa6467-6647-41e9-8184-1cacc408265c@osd.1",
        "enabled": true,
        "state": "error",
        "container_id": null,
        "container_image_name": "ceph/ceph:v16",
        "container_image_id": null,
        "version": null,
        "started": null,
        "created": null,
        "deployed": "2021-12-11T21:20:02.170515Z",
        "configured": null
    }
]
Ceph health snippet:
  services:
    mon: 3 daemons, quorum cephmon02test,cephmon03test,cephmon01test (age 21h)
    mgr: cephmon03test(active, since 21h), standbys: cephmon02test
    osd: 8 osds: 6 up (since 39m), 8 in
         flags noout
Is there a specific way to get those cephadm-adopted OSDs to show up
properly in the cluster and in the ceph orchestrator?
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx