Converting to cephadm from ceph-deploy

First off, I made a similar post on 12/11/21, but I had not explicitly signed up for the new mailing list (this email is a remnant from when the list was run with Mailman). I didn't get a reply here and couldn't reply myself, so I have to post this again. I apologize for the noise.


Hello all. I'm upgrading a cluster from (Ubuntu 16.04) Luminous to Pacific, within
which I've upgraded to (18.04) Nautilus, then to (20.04) Octopus. The cluster ran
flawlessly throughout that upgrade process, which I'm very happy about.

I'm now at the point of converting the cluster to cephadm (it was built with
ceph-deploy), but I'm running into trouble.  I've followed this doc: 
https://docs.ceph.com/en/latest/cephadm/adoption/

3 MON nodes
4 OSD nodes

The trouble is two-fold: (1) once I've adopted the MON & MGR daemons, the
localhost MON (and MGR) does not show up in "ceph orch ps"; only the daemons on
the two other MON nodes are listed:

#### On MON node ####
root@cephmon01test:~# ceph orch ps
NAME               HOST           PORTS  STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
mgr.cephmon02test  cephmon02test         running (21h)     8m ago  21h     365M        -  16.2.5   6933c2a0b7dd  e08de388b92e
mgr.cephmon03test  cephmon03test         running (21h)     6m ago  21h     411M        -  16.2.5   6933c2a0b7dd  d358b697e49b
mon.cephmon02test  cephmon02test         running (21h)     8m ago    -     934M    2048M  16.2.5   6933c2a0b7dd  f349d7cc6816
mon.cephmon03test  cephmon03test         running (21h)     6m ago    -     923M    2048M  16.2.5   6933c2a0b7dd  64880b0659cc

root@cephmon01test:~# ceph orch ls
NAME  PORTS  RUNNING  REFRESHED  AGE  PLACEMENT    
mgr              2/0  8m ago     -    <unmanaged>  
mon              2/0  8m ago     -    <unmanaged> 


All of the 'cephadm adopt' commands for the MONs and MGRs were run from the above
node.

My second issue is that when I proceed to adopt the OSDs (again, following
https://docs.ceph.com/en/latest/cephadm/adoption/), they seem to drop out of the cluster:

### on OSD node ###
root@cephosd01test:~# cephadm ls
[
    {
        "style": "cephadm:v1",
        "name": "osd.0",
        "fsid": "4cfa6467-6647-41e9-8184-1cacc408265c",
        "systemd_unit": "ceph-4cfa6467-6647-41e9-8184-1cacc408265c@osd.0",
        "enabled": true,
        "state": "error",
        "container_id": null,
        "container_image_name": "ceph/ceph:v16",
        "container_image_id": null,
        "version": null,
        "started": null,
        "created": null,
        "deployed": "2021-12-11T00:19:24.799615Z",
        "configured": null
    },
    {
        "style": "cephadm:v1",
        "name": "osd.1",
        "fsid": "4cfa6467-6647-41e9-8184-1cacc408265c",
        "systemd_unit": "ceph-4cfa6467-6647-41e9-8184-1cacc408265c@osd.1",
        "enabled": true,
        "state": "error",
        "container_id": null,
        "container_image_name": "ceph/ceph:v16",
        "container_image_id": null,
        "version": null,
        "started": null,
        "created": null,
        "deployed": "2021-12-11T21:20:02.170515Z",
        "configured": null
    }
]
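
For anyone who wants to pick the failed daemons out of a longer listing, here is a minimal sketch that filters `cephadm ls` JSON for entries whose "state" is "error". The function name `errored_daemons` and the trimmed `SAMPLE` string are only illustrative; the field names are the ones visible in the output above.

```python
import json
from typing import List

# Trimmed-down copy of the `cephadm ls` output shown above (illustrative only).
SAMPLE = """
[
    {"style": "cephadm:v1", "name": "osd.0", "state": "error"},
    {"style": "cephadm:v1", "name": "osd.1", "state": "error"}
]
"""


def errored_daemons(cephadm_ls_json: str) -> List[str]:
    """Return the names of daemons whose state is 'error'."""
    daemons = json.loads(cephadm_ls_json)
    return [d["name"] for d in daemons if d.get("state") == "error"]


print(errored_daemons(SAMPLE))
```

On a real node the JSON string would come from running `cephadm ls` and capturing its standard output instead of using `SAMPLE`.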

Ceph health snippet:
  services:
    mon: 3 daemons, quorum cephmon02test,cephmon03test,cephmon01test (age 21h)
    mgr: cephmon03test(active, since 21h), standbys: cephmon02test
    osd: 8 osds: 6 up (since 39m), 8 in
         flags noout

Is there a specific way to get those OSDs adopted by cephadm to be shown properly in the
cluster and ceph orchestrator?

I asked the same question elsewhere and was asked whether I could see my containers running. My reply to that follows:

Further background info: this cluster was built with 'ceph-deploy' on 12.2.4. I'm not sure if that's an issue _specifically_ for the conversion to cephadm, but I've been able to upgrade from Ubuntu Xenial & Luminous to Ubuntu Focal & Pacific -- it's just this conversion to cephadm that I'm having the issue with. This cluster is _only_ used for RBD devices (via Libvirt).

When I run "bash -x /var/lib/ceph/$FSID/osd.0/unit.run" I find that it's failing after looking for a block device that doesn't exist -- namely /var/lib/ceph/osd/ceph-0. This device was accurate for the ceph-deploy-built OSDs, but after 'cephadm adopt' has been run, the correct block device is '/dev/dm-1' if I'm not mistaken.

Looking at the cephadm logs, it appears this was by design as far as cephadm is concerned; however, this is clearly the wrong device, so the containers fail to start.

debug 2021-12-28T03:33:58.368+0000 7f4b3207c080 -1 bluestore(/var/lib/ceph/osd/ceph-0/block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-0/block: (13) Permission denied
debug 2021-12-28T03:33:58.368+0000 7f4b3207c080  1 bluestore(/var/lib/ceph/osd/ceph-0) _mount path /var/lib/ceph/osd/ceph-0
debug 2021-12-28T03:33:58.368+0000 7f4b3207c080  0 bluestore(/var/lib/ceph/osd/ceph-0) _open_db_and_around read-only:0 repair:0
debug 2021-12-28T03:33:58.368+0000 7f4b3207c080 -1 bluestore(/var/lib/ceph/osd/ceph-0/block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-0/block: (13) Permission denied
debug 2021-12-28T03:33:58.368+0000 7f4b3207c080  1 bdev(0x5642f6a9a400 /var/lib/ceph/osd/ceph-0/block) open path /var/lib/ceph/osd/ceph-0/block
debug 2021-12-28T03:33:58.368+0000 7f4b3207c080 -1 bdev(0x5642f6a9a400 /var/lib/ceph/osd/ceph-0/block) open open got: (13) Permission denied
debug 2021-12-28T03:33:58.368+0000 7f4b3207c080 -1 osd.0 0 OSD:init: unable to mount object store
debug 2021-12-28T03:33:58.368+0000 7f4b3207c080 -1  ** ERROR: osd init failed: (13) Permission denied
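
Since the failure is "(13) Permission denied" on the `block` path, one way to check what that path really resolves to, and who owns the target, is sketched below. This is only a rough diagnostic under my own assumptions: `describe_block` is a hypothetical helper name, and the path to pass in is whichever `block` path the OSD container actually sees. As far as I know the ceph user inside the cephadm containers is uid/gid 167, but treat that as an assumption to verify rather than a fact.

```python
import os
import stat


def describe_block(path):
    """Follow symlinks from `path` and report owner and mode of the final target."""
    target = os.path.realpath(path)   # e.g. a /dev/dm-N device behind a symlink
    st = os.stat(target)
    return {
        "target": target,
        "uid": st.st_uid,             # numeric owner of the target
        "gid": st.st_gid,             # numeric group of the target
        "mode": stat.filemode(st.st_mode),
        "is_block_device": stat.S_ISBLK(st.st_mode),
    }
```

Calling `describe_block("/var/lib/ceph/osd/ceph-0/block")` on the OSD host (the path from the log above) would show whether the symlink points at the expected device and whether its owner matches the user the containerized OSD runs as.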