To run a `ceph orch ...` command (or really any command against the cluster) you
should first open a shell with `cephadm shell`. That will put you in a bash shell
inside a container that has the ceph packages matching the ceph version in your
cluster. If you just want to run a single command rather than an interactive
shell, you can also do `cephadm shell -- ceph orch ...`.

Also, this might not turn out to be an issue, but just thinking ahead: the
devices cephadm will typically allow you to put an OSD on should match what's
output by `ceph orch device ls` (which is populated by
`cephadm ceph-volume -- inventory --format=json-pretty` if you want to look
further). So I'd generally say to always check that before creating any OSDs
through the orchestrator.

I also generally like to recommend setting up OSDs through drive group specs
(https://docs.ceph.com/en/latest/cephadm/services/osd/#advanced-osd-service-specifications)
over using `ceph orch daemon add osd ...`, although that's a tangent to what
you're trying to do now. (Rough sketches of both the device check and an example
spec are at the very bottom of this mail, below the quoted thread.)

On Wed, Nov 29, 2023 at 4:14 PM Francisco Arencibia Quesada <arencibia.francisco@xxxxxxxxx> wrote:

> Thanks so much Adam, that worked great; however I cannot add any storage
> with:
>
> sudo cephadm ceph orch daemon add osd node2-ceph:/dev/nvme1n1
>
> root@node1-ceph:~# ceph status
>   cluster:
>     id:     9d8f1112-8ef9-11ee-838e-a74e679f7866
>     health: HEALTH_WARN
>             Failed to apply 1 service(s): osd.all-available-devices
>             2 failed cephadm daemon(s)
>             OSD count 0 < osd_pool_default_size 3
>
>   services:
>     mon: 1 daemons, quorum node1-ceph (age 18m)
>     mgr: node1-ceph.jitjfd(active, since 17m)
>     osd: 0 osds: 0 up, 0 in (since 6m)
>
>   data:
>     pools:   0 pools, 0 pgs
>     objects: 0 objects, 0 B
>     usage:   0 B used, 0 B / 0 B avail
>     pgs:
>
> root@node1-ceph:~#
>
> Regards
>
> On Wed, Nov 29, 2023 at 5:45 PM Adam King <adking@xxxxxxxxxx> wrote:
>
>> I think I remember a bug that happened when there was a small mismatch
>> between the cephadm version used for bootstrapping and the container
>> image. In this case, the cephadm binary used for bootstrap knows about
>> the ceph-exporter service and the container image being used does not.
>> The ceph-exporter service was removed from quincy between 17.2.6 and
>> 17.2.7, so I'd guess the cephadm binary here is a bit older and it's
>> pulling the 17.2.7 image. For now, I'd say just work around this by
>> running bootstrap with the `--skip-monitoring-stack` flag. If you want
>> the other services in the monitoring stack after bootstrap, you can just
>> run `ceph orch apply <service>` for alertmanager, prometheus,
>> node-exporter, and grafana, and that would get you to the same spot as
>> if you hadn't provided the flag and weren't hitting the issue.
>>
>> As an extra note, this failed bootstrap might be leaving things around
>> that could cause subsequent bootstraps to fail. If you run `cephadm ls`
>> and see things listed, you can grab the fsid from the output of that
>> command and run `cephadm rm-cluster --force --fsid <fsid>` to clean up
>> the environment before bootstrapping again.
>>
>> On Wed, Nov 29, 2023 at 11:32 AM Francisco Arencibia Quesada <arencibia.francisco@xxxxxxxxx> wrote:
>>
>>> Hello guys,
>>>
>>> This situation is driving me crazy. I have tried to deploy a ceph
>>> cluster in all ways possible, even with ansible, and at some point it
>>> breaks. I'm using Ubuntu 22.04. This is one of the errors I'm having,
>>> some problem with ceph-exporter. Please could you help me, I have been
>>> dealing with this for like 5 days.
>>> Kind regards
>>>
>>> root@node1-ceph:~# cephadm bootstrap --mon-ip 10.0.0.52
>>> Verifying podman|docker is present...
>>> Verifying lvm2 is present...
>>> Verifying time synchronization is in place...
>>> Unit systemd-timesyncd.service is enabled and running
>>> Repeating the final host check...
>>> docker (/usr/bin/docker) is present
>>> systemctl is present
>>> lvcreate is present
>>> Unit systemd-timesyncd.service is enabled and running
>>> Host looks OK
>>> Cluster fsid: 4ce3a92a-8ddd-11ee-9b23-6341187f70c1
>>> Verifying IP 10.0.0.52 port 3300 ...
>>> Verifying IP 10.0.0.52 port 6789 ...
>>> Mon IP `10.0.0.52` is in CIDR network `10.0.0.0/24`
>>> Mon IP `10.0.0.52` is in CIDR network `10.0.0.0/24`
>>> Mon IP `10.0.0.52` is in CIDR network `10.0.0.1/32`
>>> Mon IP `10.0.0.52` is in CIDR network `10.0.0.1/32`
>>> Internal network (--cluster-network) has not been provided, OSD
>>> replication will default to the public_network
>>> Pulling container image quay.io/ceph/ceph:v17...
>>> Ceph version: ceph version 17.2.7
>>> (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
>>> Extracting ceph user uid/gid from container image...
>>> Creating initial keys...
>>> Creating initial monmap...
>>> Creating mon...
>>> Waiting for mon to start...
>>> Waiting for mon...
>>> mon is available
>>> Assimilating anything we can from ceph.conf...
>>> Generating new minimal ceph.conf...
>>> Restarting the monitor...
>>> Setting mon public_network to 10.0.0.1/32,10.0.0.0/24
>>> Wrote config to /etc/ceph/ceph.conf
>>> Wrote keyring to /etc/ceph/ceph.client.admin.keyring
>>> Creating mgr...
>>> Verifying port 9283 ...
>>> Waiting for mgr to start...
>>> Waiting for mgr...
>>> mgr not available, waiting (1/15)...
>>> mgr not available, waiting (2/15)...
>>> mgr not available, waiting (3/15)...
>>> mgr not available, waiting (4/15)...
>>> mgr not available, waiting (5/15)...
>>> mgr is available
>>> Enabling cephadm module...
>>> Waiting for the mgr to restart...
>>> Waiting for mgr epoch 5...
>>> mgr epoch 5 is available
>>> Setting orchestrator backend to cephadm...
>>> Generating ssh key...
>>> Wrote public SSH key to /etc/ceph/ceph.pub
>>> Adding key to root@localhost authorized_keys...
>>> Adding host node1-ceph...
>>> Deploying mon service with default placement...
>>> Deploying mgr service with default placement...
>>> Deploying crash service with default placement...
>>> Deploying ceph-exporter service with default placement...
>>> Non-zero exit code 22 from /usr/bin/docker run --rm --ipc=host
>>> --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph --init -e
>>> CONTAINER_IMAGE=quay.io/ceph/ceph:v17 -e NODE_NAME=node1-ceph -e
>>> CEPH_USE_RANDOM_NONCE=1 -v
>>> /var/log/ceph/4ce3a92a-8ddd-11ee-9b23-6341187f70c1:/var/log/ceph:z -v
>>> /tmp/ceph-tmp6yz3vt5s:/etc/ceph/ceph.client.admin.keyring:z -v
>>> /tmp/ceph-tmpfhd01qwu:/etc/ceph/ceph.conf:z quay.io/ceph/ceph:v17 orch
>>> apply ceph-exporter
>>> /usr/bin/ceph: stderr Error EINVAL: Usage:
>>> /usr/bin/ceph: stderr   ceph orch apply -i <yaml spec> [--dry-run]
>>> /usr/bin/ceph: stderr   ceph orch apply <service_type> [--placement=<placement_string>] [--unmanaged]
>>> /usr/bin/ceph: stderr
>>> Traceback (most recent call last):
>>>   File "/usr/sbin/cephadm", line 9653, in <module>
>>>     main()
>>>   File "/usr/sbin/cephadm", line 9641, in main
>>>     r = ctx.func(ctx)
>>>   File "/usr/sbin/cephadm", line 2205, in _default_image
>>>     return func(ctx)
>>>   File "/usr/sbin/cephadm", line 5774, in command_bootstrap
>>>     prepare_ssh(ctx, cli, wait_for_mgr_restart)
>>>   File "/usr/sbin/cephadm", line 5275, in prepare_ssh
>>>     cli(['orch', 'apply', t])
>>>   File "/usr/sbin/cephadm", line 5708, in cli
>>>     return CephContainer(
>>>   File "/usr/sbin/cephadm", line 4144, in run
>>>     out, _, _ = call_throws(self.ctx, self.run_cmd(),
>>>   File "/usr/sbin/cephadm", line 1853, in call_throws
>>>     raise RuntimeError('Failed command: %s' % ' '.join(command))
>>> RuntimeError: Failed command: /usr/bin/docker run --rm --ipc=host
>>> --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph --init -e
>>> CONTAINER_IMAGE=quay.io/ceph/ceph:v17 -e NODE_NAME=node1-ceph -e
>>> CEPH_USE_RANDOM_NONCE=1 -v
>>> /var/log/ceph/4ce3a92a-8ddd-11ee-9b23-6341187f70c1:/var/log/ceph:z -v
>>> /tmp/ceph-tmp6yz3vt5s:/etc/ceph/ceph.client.admin.keyring:z -v
>>> /tmp/ceph-tmpfhd01qwu:/etc/ceph/ceph.conf:z quay.io/ceph/ceph:v17 orch
>>> apply ceph-exporter
>>>
>>> --
>>> *Francisco Arencibia Quesada.*
>>> *DevOps Engineer*
>>> _______________________________________________
>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>>
>
> --
> *Francisco Arencibia Quesada.*
> *DevOps Engineer*
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
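
For reference, the `cephadm shell` / device-inventory workflow described at the
top of this thread looks roughly like the following. This is a sketch, not a
verified procedure; the hostname and device path are simply the ones quoted in
the thread, used as placeholders.

    # interactive shell in a container matching the cluster's ceph version
    cephadm shell

    # or run a single command without entering an interactive shell
    cephadm shell -- ceph orch device ls

    # the lower-level inventory that populates `ceph orch device ls`
    cephadm ceph-volume -- inventory --format=json-pretty

    # only once the device shows as available there (typically: no existing
    # partitions, LVM, or filesystem on it), add the OSD through the shell:
    cephadm shell -- ceph orch daemon add osd node2-ceph:/dev/nvme1n1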
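And a minimal sketch of the drive-group-spec approach recommended in the same
reply. The file name, service_id, and host pattern here are purely illustrative
assumptions; the linked docs cover the full set of device filters (size, model,
rotational, paths, and so on).

    # /root/osd_spec.yaml (illustrative path and name)
    service_type: osd
    service_id: example_drive_group     # illustrative
    placement:
      host_pattern: '*'                 # or list specific hosts
    spec:
      data_devices:
        all: true                       # or filter by size/model/rotational/paths

    # mount the spec into the shell container and apply it;
    # --dry-run shows what the orchestrator would do without creating OSDs
    cephadm shell -m /root/osd_spec.yaml:/tmp/osd_spec.yaml -- \
        ceph orch apply -i /tmp/osd_spec.yaml --dry-run
    cephadm shell -m /root/osd_spec.yaml:/tmp/osd_spec.yaml -- \
        ceph orch apply -i /tmp/osd_spec.yaml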
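Finally, the bootstrap workaround from Adam's earlier reply (quoted above) boils
down to roughly this sequence; the fsid is whatever `cephadm ls` reports, and the
mon IP is the one used in the thread.

    # clean up anything left behind by the failed bootstrap
    cephadm ls
    cephadm rm-cluster --force --fsid <fsid>

    # bootstrap again without the monitoring stack (avoids the ceph-exporter mismatch)
    cephadm bootstrap --mon-ip 10.0.0.52 --skip-monitoring-stack

    # afterwards, re-add the monitoring services from inside `cephadm shell`
    ceph orch apply alertmanager
    ceph orch apply prometheus
    ceph orch apply node-exporter
    ceph orch apply grafana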