Re: Trouble converting to cephadm during upgrade

Hi,

There have been multiple reports on this list that the mgr daemon sometimes stops working without any indication of a root cause. I have also experienced this quite a few times in my test clusters. Failing the mgr helped most of the time:

ceph mgr fail
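
A rough sketch of how one could check afterwards that a standby actually took over. It assumes the `ceph` CLI and an admin keyring on the node; the function name `fail_and_check_mgr` and the default delay are made up for illustration:

```shell
# Hypothetical helper: fail the active mgr, then show which mgr took over.
fail_and_check_mgr() {
    # Show the currently active mgr before the failover.
    ceph mgr stat
    # Fail the active mgr so that a standby takes over.
    ceph mgr fail
    # Give the standby a moment to become active (the delay is an arbitrary guess).
    sleep "${1:-10}"
    # Show the active mgr again; the active name should have changed.
    ceph mgr stat
    # Ask the orchestrator to refresh its daemon inventory and list it.
    ceph orch ps --refresh
}
```

Call it as `fail_and_check_mgr`, or `fail_and_check_mgr 30` for a longer wait. The orchestrator may need a few minutes before the refreshed list is complete.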

As for the OSDs, do you see any attempts to build the OSD containers? The logs could be of help here.
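
One possible way to pull those logs on an OSD node, as a sketch only. It assumes the systemd unit naming that `cephadm ls` reports (`ceph-<fsid>@<daemon>`); the function name `show_daemon_logs` is made up for illustration:

```shell
# Hypothetical helper: print the recent journal of one adopted daemon.
show_daemon_logs() {
    daemon="$1"   # daemon name as shown by 'cephadm ls', e.g. osd.0
    fsid="$2"     # cluster fsid, also shown by 'cephadm ls'
    # Read the last 100 journal lines of the systemd unit created by the adoption.
    journalctl --no-pager -n 100 -u "ceph-${fsid}@${daemon}"
}
```

For example: `show_daemon_logs osd.0 <your-fsid>`. On a node with an admin keyring, `ceph log last cephadm` should additionally show recent messages from the cephadm mgr module.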

Regards,
Eugen


Quoting Andre Goree <agoree@xxxxxxxxxxxxxxxxxx>:

Hello all. I'm upgrading a cluster from Luminous (Ubuntu 16.04) to Pacific. Along the way I've upgraded to Nautilus (18.04) and then to Octopus (20.04). The cluster ran flawlessly throughout that upgrade process, which I'm very happy about.

I'm now at the point of converting the cluster to cephadm (it was built with ceph-deploy), but I'm running into trouble. I've followed this doc: https://docs.ceph.com/en/latest/cephadm/adoption/

3 MON nodes
4 OSD nodes

The trouble is two-fold. First, once I've adopted the MON & MGR daemons, I can't get the local MON to show up in "ceph orch ps"; only the two other MON nodes are listed:

#### On MON node ####
root@cephmon01test:~# ceph orch ps
NAME               HOST           PORTS  STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
mgr.cephmon02test  cephmon02test         running (21h)  8m ago     21h  365M     -        16.2.5   6933c2a0b7dd  e08de388b92e
mgr.cephmon03test  cephmon03test         running (21h)  6m ago     21h  411M     -        16.2.5   6933c2a0b7dd  d358b697e49b
mon.cephmon02test  cephmon02test         running (21h)  8m ago     -    934M     2048M    16.2.5   6933c2a0b7dd  f349d7cc6816
mon.cephmon03test  cephmon03test         running (21h)  6m ago     -    923M     2048M    16.2.5   6933c2a0b7dd  64880b0659cc

root@cephmon01test:~# ceph orch ls
NAME  PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
mgr              2/0  8m ago     -    <unmanaged>
mon              2/0  8m ago     -    <unmanaged>


All of the 'cephadm adopt' commands for the MONs and MGRs were run from the above node.

My second issue is that when I proceed to adopt the OSDs (again, following https://docs.ceph.com/en/latest/cephadm/adoption/), they seem to drop out of the cluster:

### on OSD node ###
root@cephosd01test:~# cephadm ls
[
    {
        "style": "cephadm:v1",
        "name": "osd.0",
        "fsid": "4cfa6467-6647-41e9-8184-1cacc408265c",
        "systemd_unit": "ceph-4cfa6467-6647-41e9-8184-1cacc408265c@osd.0",
        "enabled": true,
        "state": "error",
        "container_id": null,
        "container_image_name": "ceph/ceph:v16",
        "container_image_id": null,
        "version": null,
        "started": null,
        "created": null,
        "deployed": "2021-12-11T00:19:24.799615Z",
        "configured": null
    },
    {
        "style": "cephadm:v1",
        "name": "osd.1",
        "fsid": "4cfa6467-6647-41e9-8184-1cacc408265c",
        "systemd_unit": "ceph-4cfa6467-6647-41e9-8184-1cacc408265c@osd.1",
        "enabled": true,
        "state": "error",
        "container_id": null,
        "container_image_name": "ceph/ceph:v16",
        "container_image_id": null,
        "version": null,
        "started": null,
        "created": null,
        "deployed": "2021-12-11T21:20:02.170515Z",
        "configured": null
    }
]

Ceph health snippet:
  services:
    mon: 3 daemons, quorum cephmon02test,cephmon03test,cephmon01test (age 21h)
    mgr: cephmon03test(active, since 21h), standbys: cephmon02test
    osd: 8 osds: 6 up (since 39m), 8 in
         flags noout

Is there a specific way to get the OSDs adopted by cephadm to show up properly in the cluster and in the ceph orchestrator?
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx