Can you verify that the `/usr/lib/sysctl.d/` directory exists on your
Debian machines?
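One quick way to check this on each host is a few lines of Python. This is only a sketch: the helper name `dir_exists` is made up for illustration, and the path is the one from the traceback further down the thread.

```python
from pathlib import Path

# Directory that cephadm tries to write its 90-ceph-<fsid>-osd.conf file into,
# according to the traceback quoted in this thread.
SYSCTL_DIR = Path("/usr/lib/sysctl.d")


def dir_exists(path: Path = SYSCTL_DIR) -> bool:
    """Return True if `path` exists and is a directory (hypothetical helper)."""
    return path.is_dir()


if __name__ == "__main__":
    state = "exists" if dir_exists() else "is missing"
    print(f"{SYSCTL_DIR} {state}")
```

Run it on every host in the cluster; if it reports the directory as missing, that would be consistent with the FileNotFoundError in the traceback.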
On 01.09.21 at 15:19, Alcatraz wrote:
Sebastian,
I appreciate all your help. I actually (out of desperation) spun up
another cluster, same specs, just using Ubuntu 18.04 rather than
Debian 10. All the OSDs were recognized, and all went up/in without
issue.
Thanks
On 9/1/21 06:15, Sebastian Wagner wrote:
On 30.08.21 at 17:39, Alcatraz wrote:
Sebastian,
Thanks for responding! And of course.
1. ceph orch ls --service-type osd --format yaml
Output:
service_type: osd
service_id: all-available-devices
service_name: osd.all-available-devices
placement:
  host_pattern: '*'
unmanaged: true
spec:
  data_devices:
    all: true
  filter_logic: AND
  objectstore: bluestore
status:
  created: '2021-08-30T13:57:51.000178Z'
  last_refresh: '2021-08-30T15:24:10.534710Z'
  running: 0
  size: 6
events:
- 2021-08-30T03:48:01.652108Z service:osd.all-available-devices
[INFO] "service was
created"
- "2021-08-30T03:49:00.267808Z service:osd.all-available-devices
[ERROR] \"Failed\
\ to apply: cephadm exited with an error code: 1, stderr:Non-zero
exit code 1 from\
\ /usr/bin/docker container inspect --format {{.State.Status}}
ceph-d1405594-0944-11ec-8ebc-f23c92edc936-osd.0\n\
/usr/bin/docker: stdout \n/usr/bin/docker: stderr Error: No such
container: ceph-d1405594-0944-11ec-8ebc-f23c92edc936-osd.0\n\
Deploy daemon osd.0 ...\nTraceback (most recent call last):\n File
\"/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931\"\
, line 8230, in <module>\n main()\n File
\"/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931\"\
, line 8218, in main\n r = ctx.func(ctx)\n File
\"/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931\"\
, line 1759, in _default_image\n return func(ctx)\n File
\"/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931\"\
, line 4326, in command_deploy\n ports=daemon_ports)\n File
\"/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931\"\
, line 2632, in deploy_daemon\n c, osd_fsid=osd_fsid,
ports=ports)\n File \"\
/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931\"\
, line 2801, in deploy_daemon_units\n install_sysctl(ctx, fsid,
daemon_type)\n\
\ File
\"/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931\"\
, line 2963, in install_sysctl\n _write(conf, lines)\n File
\"/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931\"\
, line 2948, in _write\n with open(conf, 'w') as
f:\nFileNotFoundError: [Errno\
\ 2] No such file or directory:
'/usr/lib/sysctl.d/90-ceph-d1405594-0944-11ec-8ebc-f23c92edc936-osd.conf'\""
https://tracker.ceph.com/issues/52481
- '2021-08-30T03:49:08.356762Z service:osd.all-available-devices
[ERROR] "Failed to
apply: auth get failed: failed to find osd.0 in keyring retval: -2"'
- '2021-08-30T03:52:34.100977Z service:osd.all-available-devices
[ERROR] "Failed to
apply: auth get failed: failed to find osd.3 in keyring retval: -2"'
- '2021-08-30T03:52:42.260439Z service:osd.all-available-devices
[ERROR] "Failed to
apply: auth get failed: failed to find osd.6 in keyring retval: -2"'
Will be fixed by https://github.com/ceph/ceph/pull/42989
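For anyone reading along, the failure mode in the traceback above is easy to reproduce outside of Ceph: `open(conf, 'w')` creates the file but not its parent directory, so it raises FileNotFoundError when the directory is missing. The sketch below is not cephadm's code, and I am not claiming it is what the linked PR does; `write_conf`, `demo`, and the file name are made up for illustration, and it runs entirely inside a temporary directory.

```python
import tempfile
from pathlib import Path


def write_conf(conf: Path, lines: list, create_parent: bool = False) -> None:
    """Write `lines` to `conf`, optionally creating the parent directory first."""
    if create_parent:
        # Create the missing directory (and any missing ancestors) before writing.
        conf.parent.mkdir(parents=True, exist_ok=True)
    with open(conf, "w") as f:
        f.write("\n".join(lines) + "\n")


def demo() -> tuple:
    """Return (failed_without_parent, file_written_with_parent)."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for /usr/lib/sysctl.d/90-ceph-<fsid>-osd.conf
        conf = Path(tmp) / "sysctl.d" / "90-ceph-example-osd.conf"
        lines = ["# placeholder content, not real cephadm output"]
        try:
            write_conf(conf, lines)
            failed = False
        except FileNotFoundError:
            # Same exception type as in the traceback: the parent dir is absent.
            failed = True
        write_conf(conf, lines, create_parent=True)
        return failed, conf.is_file()


if __name__ == "__main__":
    print(demo())
```

On an affected host, the equivalent manual workaround would presumably be to create `/usr/lib/sysctl.d/` by hand before redeploying, but I have not tested that on Debian 10.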
2. ceph orch ps --daemon-type osd --format yaml
Output: ...snip...
3. ceph auth add osd.0 osd 'allow *' mon 'allow rwx' -i
/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/osd.0/keyring
I verified
/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/osd.0/keyring
file does exist.
Output:
Error EINVAL: caps cannot be specified both in keyring and in command
You only need to create the keyring; you don't need to store it
anywhere. I'd still suggest creating the keyring somehow,
but I haven't seen this particular error before.
hth
Sebastian
Thanks
On 8/30/21 10:28, Sebastian Wagner wrote:
Could you run
1. ceph orch ls --service-type osd --format yaml
2. ceph orch ps --daemon-type osd --format yaml
3. try running the `ceph auth add` call from
https://docs.ceph.com/en/mimic/rados/operations/add-or-rm-osds/#adding-an-osd-manual
On 30.08.21 at 14:49, Alcatraz wrote:
Hello all,
Running into some issues trying to build a virtual PoC for Ceph.
Went to my cloud provider of choice and spun up some nodes. I have
three identical hosts consisting of:
Debian 10
8 cpu cores
16GB RAM
1x315GB Boot Drive
3x400GB Data drives
After deploying Ceph (v 16.2.5) using cephadm, adding hosts, and
logging into the dashboard, Ceph showed 9 OSDs, 0 up, 9 in. I
thought perhaps it just needed some time to bring up the OSDs, so
I left it running overnight.
This morning, I checked, and the Ceph dashboard shows 9 OSDs, 0
up, 6 in, 3 out. I find this odd, as it hasn't been touched since
it was deployed. Ceph health shows "HEALTH_OK", `ceph osd tree`
outputs:
ID  CLASS  WEIGHT  TYPE NAME     STATUS  REWEIGHT  PRI-AFF
-1              0  root default
 0              0  osd.0           down         0  1.00000
 1              0  osd.1           down         0  1.00000
 2              0  osd.2           down         0  1.00000
 3              0  osd.3           down   1.00000  1.00000
 4              0  osd.4           down   1.00000  1.00000
 5              0  osd.5           down   1.00000  1.00000
 6              0  osd.6           down   1.00000  1.00000
 7              0  osd.7           down   1.00000  1.00000
 8              0  osd.8           down   1.00000  1.00000
and if I run `ls /var/run/ceph` the only thing it outputs is
"d1405594-0944-11ec-8ebc-f23c92edc936" (sans quotes), which I
assume is the cluster ID? So of course, if I run `ceph daemon
osd.8 help` for example, it just returns:
Can't get admin socket path: unable to get conf option
admin_socket for osd: b"error parsing 'osd': expected string of
the form TYPE.ID, valid types are: auth, mon, osd, mds, mgr,
client\n"
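As a cross-check, the `ceph osd tree` rows above can be tallied to confirm they agree with the dashboard's "0 up, 6 in, 3 out". This is a rough sketch only: `summarize` is a made-up helper, it parses the plain-text rows exactly as pasted in this message (with no CLASS value), and it assumes that a REWEIGHT of 0 is how an "out" OSD shows up in this output.

```python
# Rows copied from the `ceph osd tree` output in this message.
OSD_TREE = """\
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0 root default
0 0 osd.0 down 0 1.00000
1 0 osd.1 down 0 1.00000
2 0 osd.2 down 0 1.00000
3 0 osd.3 down 1.00000 1.00000
4 0 osd.4 down 1.00000 1.00000
5 0 osd.5 down 1.00000 1.00000
6 0 osd.6 down 1.00000 1.00000
7 0 osd.7 down 1.00000 1.00000
8 0 osd.8 down 1.00000 1.00000
"""


def summarize(tree_text: str) -> dict:
    """Count up/down and in/out OSDs from pasted `ceph osd tree` rows."""
    counts = {"up": 0, "down": 0, "in": 0, "out": 0}
    for line in tree_text.splitlines():
        fields = line.split()
        # OSD rows here have six fields: ID WEIGHT NAME STATUS REWEIGHT PRI-AFF.
        # The header and the root bucket line do not match and are skipped.
        if len(fields) != 6 or not fields[2].startswith("osd."):
            continue
        status, reweight = fields[3], float(fields[4])
        if status in ("up", "down"):
            counts[status] += 1
        # Assumption: REWEIGHT 0 means the OSD is marked out.
        counts["in" if reweight > 0 else "out"] += 1
    return counts


if __name__ == "__main__":
    print(summarize(OSD_TREE))  # {'up': 0, 'down': 9, 'in': 6, 'out': 3}
```

So the tree and the dashboard agree: all nine OSDs are down, and osd.0 through osd.2 are the three that were marked out.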
If I look at the log within the Ceph dashboard, no errors or
warnings appear. Will Ceph not work on virtual hardware? Is there
something I need to do to bring up the OSDs?
Just as I was about to send this email I went to check the logs
and it shows the following (traceback omitted for length):
8/30/21 7:44:15 AM[ERR]Failed to apply osd.all-available-devices
spec
DriveGroupSpec(name=all-available-devices->placement=PlacementSpec(host_pattern='*'),
service_id='all-available-devices', service_type='osd',
data_devices=DeviceSelection(all=True), osd_id_claims={},
unmanaged=False, filter_logic='AND', preview_only=False): auth get
failed: failed to find osd.6 in keyring retval: -2
8/30/21 7:45:19 AM[ERR]executing create_from_spec_one(([('ceph01',
<ceph.deployment.drive_selection.selector.DriveSelection object at
0x7f63a930bf98>), ('ceph02',
<ceph.deployment.drive_selection.selector.DriveSelection object at
0x7f63a81ac8d0>), ('ceph03',
<ceph.deployment.drive_selection.selector.DriveSelection object at
0x7f63a930b0b8>)],)) failed.
and similar for the other OSDs. I'm not sure why it's complaining
about auth, because in order to even add the hosts to the cluster
I had to copy the ceph public key to the hosts to begin with.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx