Hi,
there have been multiple reports on this list that the mgr daemon sometimes seems to stop working without any indication of a root cause. I have also experienced this quite a few times in my test clusters; failing the mgr seemed to help most of the time:
ceph mgr fail
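A minimal sketch of that, using only the standard ceph CLI (without a name, 'ceph mgr fail' fails whichever mgr is currently active):

# check which mgr is active and which are standby
ceph mgr stat
# fail the active mgr so a standby takes over
ceph mgr fail
# verify that a new mgr became active
ceph -s | grep mgr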
As for the OSDs, do you see attempts to build the OSD containers? The logs could be of help here.
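Something like this on one of the OSD hosts, for example (the unit name is the one from your 'cephadm ls' output further down; osd.0 is just an example):

# journal of the adopted, containerized OSD unit
journalctl -u ceph-4cfa6467-6647-41e9-8184-1cacc408265c@osd.0 --since "1 hour ago"
# or let cephadm resolve the unit name itself
cephadm logs --name osd.0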
Regards,
Eugen
Quoting Andre Goree <agoree@xxxxxxxxxxxxxxxxxx>:
Hello all. I'm upgrading a cluster from (Ubuntu 16.04) Luminous to
Pacific, within which I've upgraded to (18.04) Nautilus, then to
(20.04) Octopus. The cluster ran flawlessly throughout that upgrade
process, which I'm very happy about.
I'm now at the point of converting the cluster to cephadm (it was
built with ceph-deploy), but I'm running into trouble. I've
followed this doc: https://docs.ceph.com/en/latest/cephadm/adoption/
3 MON nodes
4 OSD nodes
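For reference, the adoption steps from that doc look roughly like this (hostnames and OSD ids are placeholders, not an exact transcript of what I ran):

# on each legacy host: see which daemons cephadm can find
cephadm ls
# adopt the monitors and managers, one daemon at a time
cephadm adopt --style legacy --name mon.<hostname>
cephadm adopt --style legacy --name mgr.<hostname>
# switch orchestration to cephadm and distribute its SSH key
ceph mgr module enable cephadm
ceph orch set backend cephadm
ceph cephadm generate-key
ceph cephadm get-pub-key > ~/ceph.pub
ssh-copy-id -f -i ~/ceph.pub root@<hostname>
# tell the orchestrator about each host, then adopt the OSDs
ceph orch host add <hostname>
cephadm adopt --style legacy --name osd.<id>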
The trouble is two-fold: (1) it seems that once I've adopted
the MON & MGR daemons, I can't get the MON on the local host to show
up in "ceph orch ps"; it only lists the two other MON nodes:
#### On MON node ####
root@cephmon01test:~# ceph orch ps
NAME               HOST           PORTS  STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
mgr.cephmon02test  cephmon02test         running (21h)  8m ago     21h     365M        -  16.2.5   6933c2a0b7dd  e08de388b92e
mgr.cephmon03test  cephmon03test         running (21h)  6m ago     21h     411M        -  16.2.5   6933c2a0b7dd  d358b697e49b
mon.cephmon02test  cephmon02test         running (21h)  8m ago       -     934M    2048M  16.2.5   6933c2a0b7dd  f349d7cc6816
mon.cephmon03test  cephmon03test         running (21h)  6m ago       -     923M    2048M  16.2.5   6933c2a0b7dd  64880b0659cc
root@cephmon01test:~# ceph orch ls
NAME PORTS RUNNING REFRESHED AGE PLACEMENT
mgr 2/0 8m ago - <unmanaged>
mon 2/0 8m ago - <unmanaged>
All of the 'cephadm adopt' commands for the MONs and MGRs were run
from the above node.
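(Side note: my understanding is that "ceph orch ps" only lists daemons on hosts the orchestrator already knows about, which should be visible with something like the following; cephmon01test here is just my local MON host:)

ceph orch host ls
# and, if the local host is missing from that list, presumably:
ceph orch host add cephmon01test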
My second issue is that when I proceed to adopt the OSDs (again,
following https://docs.ceph.com/en/latest/cephadm/adoption/), they
seem to drop out of the cluster:
### on OSD node ###
root@cephosd01test:~# cephadm ls
[
    {
        "style": "cephadm:v1",
        "name": "osd.0",
        "fsid": "4cfa6467-6647-41e9-8184-1cacc408265c",
        "systemd_unit": "ceph-4cfa6467-6647-41e9-8184-1cacc408265c@osd.0",
        "enabled": true,
        "state": "error",
        "container_id": null,
        "container_image_name": "ceph/ceph:v16",
        "container_image_id": null,
        "version": null,
        "started": null,
        "created": null,
        "deployed": "2021-12-11T00:19:24.799615Z",
        "configured": null
    },
    {
        "style": "cephadm:v1",
        "name": "osd.1",
        "fsid": "4cfa6467-6647-41e9-8184-1cacc408265c",
        "systemd_unit": "ceph-4cfa6467-6647-41e9-8184-1cacc408265c@osd.1",
        "enabled": true,
        "state": "error",
        "container_id": null,
        "container_image_name": "ceph/ceph:v16",
        "container_image_id": null,
        "version": null,
        "started": null,
        "created": null,
        "deployed": "2021-12-11T21:20:02.170515Z",
        "configured": null
    }
]
Ceph health snippet:
  services:
    mon: 3 daemons, quorum cephmon02test,cephmon03test,cephmon01test (age 21h)
    mgr: cephmon03test(active, since 21h), standbys: cephmon02test
    osd: 8 osds: 6 up (since 39m), 8 in
         flags noout
Is there a specific way to get those cephadm-adopted OSDs to show up
properly in the cluster and in the ceph orchestrator?
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx