Brand New Cephadm Deployment, OSDs show either in/down or out/down

Hello all,

I'm running into some issues trying to build a virtual PoC for Ceph. I went to my cloud provider of choice and spun up some nodes; I have three identical hosts, each consisting of:

Debian 10
8 CPU cores
16 GB RAM
1x 315 GB boot drive
3x 400 GB data drives

After deploying Ceph (v16.2.5) with cephadm, adding the hosts, and logging into the dashboard, the cluster showed 9 OSDs: 0 up, 9 in. I figured it might just need some time to bring up the OSDs, so I left it running overnight.
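
For reference, the deployment itself followed the standard cephadm flow, roughly the following (the monitor IP below is a placeholder, not my real address):

# bootstrap the first node
cephadm bootstrap --mon-ip 10.0.0.11

# copy the cluster's SSH key to the other hosts, then add them
ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph02
ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph03
ceph orch host add ceph02
ceph orch host add ceph03

# let the orchestrator turn every available data device into an OSD
ceph orch apply osd --all-available-devices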

This morning I checked, and the Ceph dashboard shows 9 OSDs: 0 up, 6 in, 3 out. I find this odd, as the cluster hasn't been touched since it was deployed. `ceph health` shows "HEALTH_OK", and `ceph osd tree` outputs:

ID  CLASS  WEIGHT  TYPE NAME     STATUS  REWEIGHT  PRI-AFF
-1              0  root default
 0              0  osd.0           down         0  1.00000
 1              0  osd.1           down         0  1.00000
 2              0  osd.2           down         0  1.00000
 3              0  osd.3           down   1.00000  1.00000
 4              0  osd.4           down   1.00000  1.00000
 5              0  osd.5           down   1.00000  1.00000
 6              0  osd.6           down   1.00000  1.00000
 7              0  osd.7           down   1.00000  1.00000
 8              0  osd.8           down   1.00000  1.00000
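
For what it's worth, I was also planning to compare that against what the orchestrator thinks is running, with something like:

ceph orch ps --daemon-type osd   # list the OSD daemons cephadm has deployed and their state
ceph osd stat                    # quick up/in summary from the monitors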

If I run `ls /var/run/ceph`, the only thing it outputs is "d1405594-0944-11ec-8ebc-f23c92edc936" (sans quotes), which I assume is the cluster ID? And if I then run, for example, `ceph daemon osd.8 help`, it just returns:

Can't get admin socket path: unable to get conf option admin_socket for osd: b"error parsing 'osd': expected string of the form TYPE.ID, valid types are: auth, mon, osd, mds, mgr, client\n"
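
I realize that with cephadm the daemons run in containers, so I assume the admin sockets (if they exist at all while the OSDs are down) live under that fsid directory rather than directly in /var/run/ceph. My next attempt will be along these lines (the .asok filename is just my guess at the usual naming):

# enter the osd.8 container, if it is actually running
cephadm enter --name osd.8

# or point ceph at the socket path under the fsid directory from the host
ceph daemon /var/run/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/ceph-osd.8.asok help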

If I look at the log within the Ceph dashboard, no errors or warnings appear. Will Ceph not work on virtual hardware? Is there something I need to do to bring up the OSDs?
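
For completeness, the other things I plan to check are whether the drives are even being seen as available, and what the OSD units log on the hosts, e.g.:

ceph orch device ls         # should list the 3x 400 GB data drives per host and whether they are available
cephadm logs --name osd.0   # journald output for one of the down OSDs, run on the host that owns it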

Just as I was about to send this email, I went to check the logs again, and they show the following (traceback omitted for length):

8/30/21 7:44:15 AM[ERR]Failed to apply osd.all-available-devices spec DriveGroupSpec(name=all-available-devices->placement=PlacementSpec(host_pattern='*'), service_id='all-available-devices', service_type='osd', data_devices=DeviceSelection(all=True), osd_id_claims={}, unmanaged=False, filter_logic='AND', preview_only=False): auth get failed: failed to find osd.6 in keyring retval: -2

8/30/21 7:45:19 AM[ERR]executing create_from_spec_one(([('ceph01', <ceph.deployment.drive_selection.selector.DriveSelection object at 0x7f63a930bf98>), ('ceph02', <ceph.deployment.drive_selection.selector.DriveSelection object at 0x7f63a81ac8d0>), ('ceph03', <ceph.deployment.drive_selection.selector.DriveSelection object at 0x7f63a930b0b8>)],)) failed.

and similar errors for the other OSDs. I'm not sure why it's complaining about auth, because in order to even add the hosts to the cluster I had to copy the ceph public key to them to begin with.
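
Since the error specifically says it failed to find osd.6 in the keyring, I'm also going to check whether those OSDs ever got cephx keys registered at all, e.g.:

ceph auth ls | grep '^osd\.'   # list whichever osd.* entities exist in the cluster
ceph auth get osd.6            # the entity the error says is missing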
