Thanks so much Adam, that worked great; however, I cannot add any storage with:

sudo cephadm ceph orch daemon add osd node2-ceph:/dev/nvme1n1

root@node1-ceph:~# ceph status
  cluster:
    id:     9d8f1112-8ef9-11ee-838e-a74e679f7866
    health: HEALTH_WARN
            Failed to apply 1 service(s): osd.all-available-devices
            2 failed cephadm daemon(s)
            OSD count 0 < osd_pool_default_size 3

  services:
    mon: 1 daemons, quorum node1-ceph (age 18m)
    mgr: node1-ceph.jitjfd(active, since 17m)
    osd: 0 osds: 0 up, 0 in (since 6m)

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:

root@node1-ceph:~#

Regards

On Wed, Nov 29, 2023 at 5:45 PM Adam King <adking@xxxxxxxxxx> wrote:

> I think I remember a bug that happened when there was a small mismatch
> between the cephadm version being used for bootstrapping and the container.
> In this case, the cephadm binary used for bootstrap knows about the
> ceph-exporter service and the container image being used does not. The
> ceph-exporter was removed from quincy between 17.2.6 and 17.2.7, so I'd
> guess the cephadm binary here is a bit older and it's pulling the 17.2.7
> image. For now, I'd say just work around this by running bootstrap with the
> `--skip-monitoring-stack` flag. If you want the other services in the
> monitoring stack after bootstrap, you can just run `ceph orch apply
> <service>` for the services alertmanager, prometheus, node-exporter, and
> grafana, and that would get you to the same spot as if you hadn't provided
> the flag and weren't hitting the issue.
>
> As an extra note, this failed bootstrap might be leaving things around
> that could cause subsequent bootstraps to fail. If you run `cephadm ls` and
> see things listed, you can grab the fsid from the output of that command
> and run `cephadm rm-cluster --force --fsid <fsid>` to clean up the env
> before bootstrapping again.
>
> On Wed, Nov 29, 2023 at 11:32 AM Francisco Arencibia Quesada <
> arencibia.francisco@xxxxxxxxx> wrote:
>
>> Hello guys,
>>
>> This situation is driving me crazy. I have tried to deploy a Ceph cluster
>> in every way possible, even with Ansible, and at some point it breaks. I'm
>> using Ubuntu 22.04. This is one of the errors I'm getting: some problem
>> with ceph-exporter. Could you please help me? I have been dealing with
>> this for about 5 days.
>> Kind regards
>>
>> root@node1-ceph:~# cephadm bootstrap --mon-ip 10.0.0.52
>> Verifying podman|docker is present...
>> Verifying lvm2 is present...
>> Verifying time synchronization is in place...
>> Unit systemd-timesyncd.service is enabled and running
>> Repeating the final host check...
>> docker (/usr/bin/docker) is present
>> systemctl is present
>> lvcreate is present
>> Unit systemd-timesyncd.service is enabled and running
>> Host looks OK
>> Cluster fsid: 4ce3a92a-8ddd-11ee-9b23-6341187f70c1
>> Verifying IP 10.0.0.52 port 3300 ...
>> Verifying IP 10.0.0.52 port 6789 ...
>> Mon IP `10.0.0.52` is in CIDR network `10.0.0.0/24`
>> Mon IP `10.0.0.52` is in CIDR network `10.0.0.0/24`
>> Mon IP `10.0.0.52` is in CIDR network `10.0.0.1/32`
>> Mon IP `10.0.0.52` is in CIDR network `10.0.0.1/32`
>> Internal network (--cluster-network) has not been provided, OSD replication
>> will default to the public_network
>> Pulling container image quay.io/ceph/ceph:v17...
>> Ceph version: ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
>> Extracting ceph user uid/gid from container image...
>> Creating initial keys...
>> Creating initial monmap...
>> Creating mon...
>> Waiting for mon to start...
>> Waiting for mon...
>> mon is available
>> Assimilating anything we can from ceph.conf...
>> Generating new minimal ceph.conf...
>> Restarting the monitor...
>> Setting mon public_network to 10.0.0.1/32,10.0.0.0/24
>> Wrote config to /etc/ceph/ceph.conf
>> Wrote keyring to /etc/ceph/ceph.client.admin.keyring
>> Creating mgr...
>> Verifying port 9283 ...
>> Waiting for mgr to start...
>> Waiting for mgr...
>> mgr not available, waiting (1/15)...
>> mgr not available, waiting (2/15)...
>> mgr not available, waiting (3/15)...
>> mgr not available, waiting (4/15)...
>> mgr not available, waiting (5/15)...
>> mgr is available
>> Enabling cephadm module...
>> Waiting for the mgr to restart...
>> Waiting for mgr epoch 5...
>> mgr epoch 5 is available
>> Setting orchestrator backend to cephadm...
>> Generating ssh key...
>> Wrote public SSH key to /etc/ceph/ceph.pub
>> Adding key to root@localhost authorized_keys...
>> Adding host node1-ceph...
>> Deploying mon service with default placement...
>> Deploying mgr service with default placement...
>> Deploying crash service with default placement...
>> Deploying ceph-exporter service with default placement...
>> Non-zero exit code 22 from /usr/bin/docker run --rm --ipc=host
>> --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph --init -e
>> CONTAINER_IMAGE=quay.io/ceph/ceph:v17 -e NODE_NAME=node1-ceph -e
>> CEPH_USE_RANDOM_NONCE=1 -v
>> /var/log/ceph/4ce3a92a-8ddd-11ee-9b23-6341187f70c1:/var/log/ceph:z -v
>> /tmp/ceph-tmp6yz3vt5s:/etc/ceph/ceph.client.admin.keyring:z -v
>> /tmp/ceph-tmpfhd01qwu:/etc/ceph/ceph.conf:z quay.io/ceph/ceph:v17 orch
>> apply ceph-exporter
>> /usr/bin/ceph: stderr Error EINVAL: Usage:
>> /usr/bin/ceph: stderr   ceph orch apply -i <yaml spec> [--dry-run]
>> /usr/bin/ceph: stderr   ceph orch apply <service_type> [--placement=<placement_string>] [--unmanaged]
>> /usr/bin/ceph: stderr
>> Traceback (most recent call last):
>>   File "/usr/sbin/cephadm", line 9653, in <module>
>>     main()
>>   File "/usr/sbin/cephadm", line 9641, in main
>>     r = ctx.func(ctx)
>>   File "/usr/sbin/cephadm", line 2205, in _default_image
>>     return func(ctx)
>>   File "/usr/sbin/cephadm", line 5774, in command_bootstrap
>>     prepare_ssh(ctx, cli, wait_for_mgr_restart)
>>   File "/usr/sbin/cephadm", line 5275, in prepare_ssh
>>     cli(['orch', 'apply', t])
>>   File "/usr/sbin/cephadm", line 5708, in cli
>>     return CephContainer(
>>   File "/usr/sbin/cephadm", line 4144, in run
>>     out, _, _ = call_throws(self.ctx, self.run_cmd(),
>>   File "/usr/sbin/cephadm", line 1853, in call_throws
>>     raise RuntimeError('Failed command: %s' % ' '.join(command))
>> RuntimeError: Failed command: /usr/bin/docker run --rm --ipc=host
>> --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph --init -e
>> CONTAINER_IMAGE=quay.io/ceph/ceph:v17 -e NODE_NAME=node1-ceph -e
>> CEPH_USE_RANDOM_NONCE=1 -v
>> /var/log/ceph/4ce3a92a-8ddd-11ee-9b23-6341187f70c1:/var/log/ceph:z -v
>> /tmp/ceph-tmp6yz3vt5s:/etc/ceph/ceph.client.admin.keyring:z -v
>> /tmp/ceph-tmpfhd01qwu:/etc/ceph/ceph.conf:z quay.io/ceph/ceph:v17 orch
>> apply ceph-exporter
>>
>> --
>> *Francisco Arencibia Quesada.*
>> *DevOps Engineer*
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@xxxxxxx
>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>

-- 
*Francisco Arencibia Quesada.*
*DevOps Engineer*
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
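
For reference, a minimal sketch of the usual cephadm sequence for bringing
the first OSDs up on a second host; node2-ceph and /dev/nvme1n1 are taken
from the message above, and exact steps can differ between releases. The
commands are meant to be run from `cephadm shell` (or any host that has the
admin keyring):

# copy the cluster's SSH key so the orchestrator can reach the new host
ssh-copy-id -f -i /etc/ceph/ceph.pub root@node2-ceph
# register the host with the orchestrator
ceph orch host add node2-ceph
# confirm the device is listed as available before trying to use it
ceph orch device ls
# create an OSD on that device
ceph orch daemon add osd node2-ceph:/dev/nvme1n1
# inspect the daemons cephadm reports as failed and the health warnings
ceph orch ps
ceph health detail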