Hi,

Thank you Sebastian, creating the folder /usr/lib/sysctl.d fixed the bug! So it's a Debian-specific bug.
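For anyone else who hits this: the workaround boils down to creating that directory on every Debian host so cephadm can write its sysctl conf, along the lines of the sketch below. The host names ceph01..ceph03 are the ones from this thread, and ssh/root access is only an assumption about the setup, so adapt as needed:

    # create the directory cephadm wants to write its sysctl conf into,
    # on every Debian host in the cluster
    for h in ceph01 ceph02 ceph03; do
        ssh root@"$h" 'mkdir -p /usr/lib/sysctl.d'
    done

With the directory in place, the OSD deploy no longer fails writing the sysctl conf (the FileNotFoundError quoted below).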
'Jof

On Thu, Sep 2, 2021 at 10:52, Sebastian Wagner <sewagner@xxxxxxxxxx> wrote:

> Can you verify that the `/usr/lib/sysctl.d/` folder exists on your
> Debian machines?
>
> On 01.09.21 at 15:19, Alcatraz wrote:
> > Sebastian,
> >
> > I appreciate all your help. I actually (out of desperation) spun up
> > another cluster, same specs, just using Ubuntu 18.04 rather than
> > Debian 10. All the OSDs were recognized, and all went up/in without
> > issue.
> >
> > Thanks
> >
> > On 9/1/21 06:15, Sebastian Wagner wrote:
> >> On 30.08.21 at 17:39, Alcatraz wrote:
> >>> Sebastian,
> >>>
> >>> Thanks for responding! And of course.
> >>>
> >>> 1. ceph orch ls --service-type osd --format yaml
> >>>
> >>> Output:
> >>>
> >>> service_type: osd
> >>> service_id: all-available-devices
> >>> service_name: osd.all-available-devices
> >>> placement:
> >>>   host_pattern: '*'
> >>> unmanaged: true
> >>> spec:
> >>>   data_devices:
> >>>     all: true
> >>>   filter_logic: AND
> >>>   objectstore: bluestore
> >>> status:
> >>>   created: '2021-08-30T13:57:51.000178Z'
> >>>   last_refresh: '2021-08-30T15:24:10.534710Z'
> >>>   running: 0
> >>>   size: 6
> >>> events:
> >>> - 2021-08-30T03:48:01.652108Z service:osd.all-available-devices
> >>>   [INFO] "service was created"
> >>> - 2021-08-30T03:49:00.267808Z service:osd.all-available-devices
> >>>   [ERROR] "Failed to apply: cephadm exited with an error code: 1,
> >>>   stderr: Non-zero exit code 1 from /usr/bin/docker container
> >>>   inspect --format {{.State.Status}}
> >>>   ceph-d1405594-0944-11ec-8ebc-f23c92edc936-osd.0
> >>>   /usr/bin/docker: stdout
> >>>   /usr/bin/docker: stderr Error: No such container:
> >>>   ceph-d1405594-0944-11ec-8ebc-f23c92edc936-osd.0
> >>>   Deploy daemon osd.0 ...
> >>>   Traceback (most recent call last):
> >>>     File "/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931", line 8230, in <module>
> >>>       main()
> >>>     File "/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931", line 8218, in main
> >>>       r = ctx.func(ctx)
> >>>     File "/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931", line 1759, in _default_image
> >>>       return func(ctx)
> >>>     File "/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931", line 4326, in command_deploy
> >>>       ports=daemon_ports)
> >>>     File "/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931", line 2632, in deploy_daemon
> >>>       c, osd_fsid=osd_fsid, ports=ports)
> >>>     File "/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931", line 2801, in deploy_daemon_units
> >>>       install_sysctl(ctx, fsid, daemon_type)
> >>>     File "/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931", line 2963, in install_sysctl
> >>>       _write(conf, lines)
> >>>     File "/var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/cephadm.d4237e4639c108308fe13147b1c08af93c3d5724d9ff21ae797eb4b78fea3931", line 2948, in _write
> >>>       with open(conf, 'w') as f:
> >>>   FileNotFoundError: [Errno 2] No such file or directory:
> >>>   '/usr/lib/sysctl.d/90-ceph-d1405594-0944-11ec-8ebc-f23c92edc936-osd.conf'"
> >>
> >> https://tracker.ceph.com/issues/52481
> >>
> >>> - 2021-08-30T03:49:08.356762Z service:osd.all-available-devices
> >>>   [ERROR] "Failed to apply: auth get failed: failed to find osd.0
> >>>   in keyring retval: -2"
> >>> - 2021-08-30T03:52:34.100977Z service:osd.all-available-devices
> >>>   [ERROR] "Failed to apply: auth get failed: failed to find osd.3
> >>>   in keyring retval: -2"
> >>> - 2021-08-30T03:52:42.260439Z service:osd.all-available-devices
> >>>   [ERROR] "Failed to apply: auth get failed: failed to find osd.6
> >>>   in keyring retval: -2"
> >>
> >> Will be fixed by https://github.com/ceph/ceph/pull/42989
> >>
> >>> 2. ceph orch ps --daemon-type osd --format yaml
> >>>
> >>> Output: ...snip...
> >>>
> >>> 3. ceph auth add osd.0 osd 'allow *' mon 'allow rwx' -i
> >>> /var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/osd.0/keyring
> >>>
> >>> I verified the
> >>> /var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/osd.0/keyring
> >>> file does exist.
> >>>
> >>> Output:
> >>>
> >>> Error EINVAL: caps cannot be specified both in keyring and in command
> >>
> >> You only need to create the keyring; you don't need to store the
> >> keyring anywhere. I'd still suggest somehow creating the keyring,
> >> but I haven't seen this particular error before.
> >>
> >> hth
> >>
> >> Sebastian
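That EINVAL is presumably because the keyring cephadm wrote under /var/lib/ceph/<fsid>/osd.0/ already carries caps, so specifying caps on the command line as well gets rejected. Untested here, but something along these lines should avoid the clash (the keyring path is the one from above):

    # let the caps come from the keyring file itself
    ceph auth add osd.0 -i /var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/osd.0/keyring

    # or import the entry wholesale, key and caps included
    ceph auth import -i /var/lib/ceph/d1405594-0944-11ec-8ebc-f23c92edc936/osd.0/keyring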
> >>
> >>> Thanks
> >>>
> >>> On 8/30/21 10:28, Sebastian Wagner wrote:
> >>>> Could you run
> >>>>
> >>>> 1. ceph orch ls --service-type osd --format yaml
> >>>>
> >>>> 2. ceph orch ps --daemon-type osd --format yaml
> >>>>
> >>>> 3. try running the `ceph auth add` call from
> >>>> https://docs.ceph.com/en/mimic/rados/operations/add-or-rm-osds/#adding-an-osd-manual
> >>>>
> >>>> On 30.08.21 at 14:49, Alcatraz wrote:
> >>>>> Hello all,
> >>>>>
> >>>>> Running into some issues trying to build a virtual PoC for Ceph.
> >>>>> Went to my cloud provider of choice and spun up some nodes. I have
> >>>>> three identical hosts consisting of:
> >>>>>
> >>>>> Debian 10
> >>>>> 8 CPU cores
> >>>>> 16 GB RAM
> >>>>> 1x 315 GB boot drive
> >>>>> 3x 400 GB data drives
> >>>>>
> >>>>> After deploying Ceph (v16.2.5) using cephadm, adding hosts, and
> >>>>> logging into the dashboard, Ceph showed 9 OSDs, 0 up, 9 in. I
> >>>>> thought perhaps it just needed some time to bring up the OSDs, so
> >>>>> I left it running overnight.
> >>>>>
> >>>>> This morning I checked, and the Ceph dashboard shows 9 OSDs, 0
> >>>>> up, 6 in, 3 out. I find this odd, as it hasn't been touched since
> >>>>> it was deployed. Ceph health shows "HEALTH_OK", and `ceph osd
> >>>>> tree` outputs:
> >>>>>
> >>>>> ID  CLASS  WEIGHT  TYPE NAME     STATUS  REWEIGHT  PRI-AFF
> >>>>> -1              0  root default
> >>>>>  0              0      osd.0      down         0  1.00000
> >>>>>  1              0      osd.1      down         0  1.00000
> >>>>>  2              0      osd.2      down         0  1.00000
> >>>>>  3              0      osd.3      down   1.00000  1.00000
> >>>>>  4              0      osd.4      down   1.00000  1.00000
> >>>>>  5              0      osd.5      down   1.00000  1.00000
> >>>>>  6              0      osd.6      down   1.00000  1.00000
> >>>>>  7              0      osd.7      down   1.00000  1.00000
> >>>>>  8              0      osd.8      down   1.00000  1.00000
> >>>>>
> >>>>> and if I run `ls /var/run/ceph` the only thing it outputs is
> >>>>> "d1405594-0944-11ec-8ebc-f23c92edc936" (sans quotes), which I
> >>>>> assume is the cluster ID? So of course, if I run `ceph daemon
> >>>>> osd.8 help` for example, it just returns:
> >>>>>
> >>>>> Can't get admin socket path: unable to get conf option
> >>>>> admin_socket for osd: b"error parsing 'osd': expected string of
> >>>>> the form TYPE.ID, valid types are: auth, mon, osd, mds, mgr,
> >>>>> client\n"
> >>>>>
> >>>>> If I look at the log within the Ceph dashboard, no errors or
> >>>>> warnings appear. Will Ceph not work on virtual hardware? Is there
> >>>>> something I need to do to bring up the OSDs?
> >>>>>
> >>>>> Just as I was about to send this email I went to check the logs,
> >>>>> and they show the following (traceback omitted for length):
> >>>>>
> >>>>> 8/30/21 7:44:15 AM [ERR] Failed to apply osd.all-available-devices
> >>>>> spec
> >>>>> DriveGroupSpec(name=all-available-devices->placement=PlacementSpec(host_pattern='*'),
> >>>>> service_id='all-available-devices', service_type='osd',
> >>>>> data_devices=DeviceSelection(all=True), osd_id_claims={},
> >>>>> unmanaged=False, filter_logic='AND', preview_only=False): auth get
> >>>>> failed: failed to find osd.6 in keyring retval: -2
> >>>>>
> >>>>> 8/30/21 7:45:19 AM [ERR] executing create_from_spec_one(([('ceph01',
> >>>>> <ceph.deployment.drive_selection.selector.DriveSelection object at
> >>>>> 0x7f63a930bf98>), ('ceph02',
> >>>>> <ceph.deployment.drive_selection.selector.DriveSelection object at
> >>>>> 0x7f63a81ac8d0>), ('ceph03',
> >>>>> <ceph.deployment.drive_selection.selector.DriveSelection object at
> >>>>> 0x7f63a930b0b8>)],)) failed.
> >>>>>
> >>>>> and similar for the other OSDs. I'm not sure why it's complaining
> >>>>> about auth, because in order to even add the hosts to the cluster
> >>>>> I had to copy the ceph public key to the hosts to begin with.
> >>>>>
> >>>>> _______________________________________________
> >>>>> ceph-users mailing list -- ceph-users@xxxxxxx
> >>>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx