Hello all,
Running into some issues trying to build a virtual PoC for Ceph. I went to
my cloud provider of choice and spun up some nodes: three identical hosts,
each consisting of:
Debian 10
8 CPU cores
16GB RAM
1x 315GB boot drive
3x 400GB data drives
After deploying Ceph (v 16.2.5) using cephadm, adding hosts, and logging
into the dashboard, Ceph showed 9 OSDs, 0 up, 9 in. I thought perhaps it
just needed some time to bring up the OSDs, so I left it running overnight.
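For reference, the deployment was basically the standard cephadm sequence
from the docs (the IPs below are placeholders rather than my exact ones):

  cephadm bootstrap --mon-ip <ip-of-ceph01>
  ceph orch host add ceph02 <ip-of-ceph02>
  ceph orch host add ceph03 <ip-of-ceph03>
  ceph orch apply osd --all-available-devices
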
This morning I checked, and the Ceph dashboard now shows 9 OSDs, 0 up, 6
in, 3 out. I find this odd, as the cluster hasn't been touched since it was
deployed. Ceph health shows "HEALTH_OK", and `ceph osd tree` outputs:
ID  CLASS  WEIGHT  TYPE NAME     STATUS  REWEIGHT  PRI-AFF
-1              0  root default
 0              0  osd.0           down         0  1.00000
 1              0  osd.1           down         0  1.00000
 2              0  osd.2           down         0  1.00000
 3              0  osd.3           down   1.00000  1.00000
 4              0  osd.4           down   1.00000  1.00000
 5              0  osd.5           down   1.00000  1.00000
 6              0  osd.6           down   1.00000  1.00000
 7              0  osd.7           down   1.00000  1.00000
 8              0  osd.8           down   1.00000  1.00000
and if I run `ls /var/run/ceph`, the only thing it outputs is
"d1405594-0944-11ec-8ebc-f23c92edc936" (sans quotes), which I assume is
the cluster ID? So of course, if I run, for example, `ceph daemon osd.8
help`, it just returns:
Can't get admin socket path: unable to get conf option admin_socket for
osd: b"error parsing 'osd': expected string of the form TYPE.ID, valid
types are: auth, mon, osd, mds, mgr, client\n"
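(I assume the admin socket commands have to be run from inside the daemon's
container, i.e. something along the lines of

  cephadm enter --name osd.8
  ceph daemon osd.8 status

but since none of the OSDs ever came up, I'm guessing there's no container
to enter and no socket under /var/run/ceph/<fsid>/ anyway.)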
If I look at the log within the Ceph dashboard, no errors or warnings
appear. Will Ceph not work on virtual hardware? Is there something I
need to do to bring up the OSDs?
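I can attach the output of any of the following if it would help narrow
things down (just the commands I know to check, happy to run others):

  ceph orch device ls
  ceph orch ps --daemon-type osd
  ceph log last cephadm
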
Just as I was about to send this email, I went to check the logs and found
the following (traceback omitted for length):
8/30/21 7:44:15 AM [ERR] Failed to apply osd.all-available-devices spec
DriveGroupSpec(name=all-available-devices->placement=PlacementSpec(host_pattern='*'),
service_id='all-available-devices', service_type='osd',
data_devices=DeviceSelection(all=True), osd_id_claims={},
unmanaged=False, filter_logic='AND', preview_only=False): auth get
failed: failed to find osd.6 in keyring retval: -2
8/30/21 7:45:19 AM [ERR] executing create_from_spec_one(([('ceph01',
<ceph.deployment.drive_selection.selector.DriveSelection object at
0x7f63a930bf98>), ('ceph02',
<ceph.deployment.drive_selection.selector.DriveSelection object at
0x7f63a81ac8d0>), ('ceph03',
<ceph.deployment.drive_selection.selector.DriveSelection object at
0x7f63a930b0b8>)],)) failed.
and similar for the other OSDs. I'm not sure why it's complaining about
auth, because in order to even add the hosts to the cluster I had to
copy the ceph public key to the hosts to begin with.
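(The key copy was just the usual step from the cephadm docs, something like

  ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph02
  ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph03

so maybe the "keyring" in the error refers to something else, e.g. the
cephx keys rather than the SSH key?)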