Thanks again guys,

The cluster is healthy now. Is this normal? Everything looks good except for this output: *Has a FileSystem, Insufficient space (<10 extents) on vgs, LVM detected*

root@node1-ceph:~# cephadm shell -- ceph status
Inferring fsid 209a7bf0-8f6d-11ee-8828-23977d76b74f
Inferring config /var/lib/ceph/209a7bf0-8f6d-11ee-8828-23977d76b74f/mon.node1-ceph/config
Using ceph image with id '921993c4dfd2' and tag 'v17' created on 2023-11-22 16:03:22 +0000 UTC
quay.io/ceph/ceph@sha256:dad2876c2916b732d060b71320f97111bc961108f9c249f4daa9540957a2b6a2
  cluster:
    id:     209a7bf0-8f6d-11ee-8828-23977d76b74f
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum node1-ceph,node2-ceph,node3-ceph (age 2h)
    mgr: node1-ceph.peedpx(active, since 2h), standbys: node2-ceph.ykkvho
    osd: 3 osds: 3 up (since 2h), 3 in (since 2h)

  data:
    pools:   2 pools, 33 pgs
    objects: 7 objects, 449 KiB
    usage:   873 MiB used, 299 GiB / 300 GiB avail
    pgs:     33 active+clean

root@node1-ceph:~# cephadm shell -- ceph orch device ls --wide
Inferring fsid 209a7bf0-8f6d-11ee-8828-23977d76b74f
Inferring config /var/lib/ceph/209a7bf0-8f6d-11ee-8828-23977d76b74f/mon.node1-ceph/config
Using ceph image with id '921993c4dfd2' and tag 'v17' created on 2023-11-22 16:03:22 +0000 UTC
quay.io/ceph/ceph@sha256:dad2876c2916b732d060b71320f97111bc961108f9c249f4daa9540957a2b6a2
HOST        PATH       TYPE  TRANSPORT  RPM  DEVICE ID  SIZE  HEALTH  IDENT  FAULT  AVAILABLE  REFRESHED  REJECT REASONS
node1-ceph  /dev/xvdb  ssd                              100G          N/A    N/A    No         27m ago    Has a FileSystem, Insufficient space (<10 extents) on vgs, LVM detected
node2-ceph  /dev/xvdb  ssd                              100G          N/A    N/A    No         27m ago    Has a FileSystem, Insufficient space (<10 extents) on vgs, LVM detected
node3-ceph  /dev/xvdb  ssd                              100G          N/A    N/A    No         27m ago    Has a FileSystem, Insufficient space (<10 extents) on vgs, LVM detected
root@node1-ceph:~#

On Wed, Nov 29, 2023 at 10:38 PM Adam King <adking@xxxxxxxxxx> wrote:

> To run a `ceph orch...` (or really any command against the cluster) you
> should first open a shell with `cephadm shell`. That will put you in a
> bash shell inside a container that has the ceph packages matching the
> ceph version in your cluster. If you just want a single command rather
> than an interactive shell, you can also do `cephadm shell -- ceph
> orch...`. Also, this might not turn out to be an issue, but just thinking
> ahead, the devices cephadm will typically allow you to put an OSD on
> should match what's output by `ceph orch device ls` (which is populated
> by `cephadm ceph-volume -- inventory --format=json-pretty` if you want to
> look further). So I'd generally say to always check that before making
> any OSDs through the orchestrator. I also generally like to recommend
> setting up OSDs through drive group specs (
> https://docs.ceph.com/en/latest/cephadm/services/osd/#advanced-osd-service-specifications)
> over using `ceph orch daemon add osd...`, although that's a tangent to
> what you're trying to do now.
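For reference, a minimal drive-group (OSD service) spec of the kind Adam links to might look like the sketch below. The filename, the service_id, and the `all: true` device filter are only illustrative; adjust the placement and device filters to match what `ceph orch device ls` reports as available on your hosts.

    # osd_spec.yml (illustrative): create OSDs on every available device on every host
    service_type: osd
    service_id: default_drive_group
    placement:
      host_pattern: '*'
    spec:
      data_devices:
        all: true

It can then be previewed and applied from inside `cephadm shell`:

    ceph orch apply -i osd_spec.yml --dry-run   # preview what the orchestrator would create
    ceph orch apply -i osd_spec.yml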
>
> On Wed, Nov 29, 2023 at 4:14 PM Francisco Arencibia Quesada <arencibia.francisco@xxxxxxxxx> wrote:
>
>> Thanks so much Adam, that worked great. However, I cannot add any
>> storage with:
>>
>> sudo cephadm ceph orch daemon add osd node2-ceph:/dev/nvme1n1
>>
>> root@node1-ceph:~# ceph status
>>   cluster:
>>     id:     9d8f1112-8ef9-11ee-838e-a74e679f7866
>>     health: HEALTH_WARN
>>             Failed to apply 1 service(s): osd.all-available-devices
>>             2 failed cephadm daemon(s)
>>             OSD count 0 < osd_pool_default_size 3
>>
>>   services:
>>     mon: 1 daemons, quorum node1-ceph (age 18m)
>>     mgr: node1-ceph.jitjfd(active, since 17m)
>>     osd: 0 osds: 0 up, 0 in (since 6m)
>>
>>   data:
>>     pools:   0 pools, 0 pgs
>>     objects: 0 objects, 0 B
>>     usage:   0 B used, 0 B / 0 B avail
>>     pgs:
>>
>> root@node1-ceph:~#
>>
>> Regards
>>
>> On Wed, Nov 29, 2023 at 5:45 PM Adam King <adking@xxxxxxxxxx> wrote:
>>
>>> I think I remember a bug that happened when there was a small mismatch
>>> between the cephadm version being used for bootstrapping and the
>>> container. In this case, the cephadm binary used for bootstrap knows
>>> about the ceph-exporter service and the container image being used does
>>> not. The ceph-exporter was removed from quincy between 17.2.6 and
>>> 17.2.7, so I'd guess the cephadm binary here is a bit older and it's
>>> pulling the 17.2.7 image. For now, I'd say just work around this by
>>> running bootstrap with the `--skip-monitoring-stack` flag. If you want
>>> the other services in the monitoring stack after bootstrap, you can
>>> just run `ceph orch apply <service>` for the services alertmanager,
>>> prometheus, node-exporter, and grafana, and it would get you in the
>>> same spot as if you didn't provide the flag and weren't hitting the
>>> issue.
>>>
>>> For an extra note, this failed bootstrap might be leaving things around
>>> that could cause subsequent bootstraps to fail. If you run `cephadm ls`
>>> and see things listed, you can grab the fsid from the output of that
>>> command and run `cephadm rm-cluster --force --fsid <fsid>` to clean up
>>> the env before bootstrapping again.
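Putting Adam's workaround together, a sketch of the full sequence might look like this. The fsid and mon IP below are simply the ones from the failed run quoted further down; take the actual fsid from your own `cephadm ls` output, and skip the cleanup step if `cephadm ls` lists nothing.

    # clean up leftovers from the failed bootstrap, if `cephadm ls` lists anything
    cephadm ls
    cephadm rm-cluster --force --fsid 4ce3a92a-8ddd-11ee-9b23-6341187f70c1

    # bootstrap without the monitoring stack to avoid the ceph-exporter mismatch
    cephadm bootstrap --mon-ip 10.0.0.52 --skip-monitoring-stack

    # then add the monitoring services back individually
    cephadm shell -- ceph orch apply prometheus
    cephadm shell -- ceph orch apply alertmanager
    cephadm shell -- ceph orch apply node-exporter
    cephadm shell -- ceph orch apply grafana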
>>>
>>> On Wed, Nov 29, 2023 at 11:32 AM Francisco Arencibia Quesada <arencibia.francisco@xxxxxxxxx> wrote:
>>>
>>>> Hello guys,
>>>>
>>>> This situation is driving me crazy. I have tried to deploy a ceph
>>>> cluster in all ways possible, even with ansible, and at some point it
>>>> breaks. I'm using Ubuntu 22.04. This is one of the errors I'm having,
>>>> some problem with ceph-exporter. Please could you help me, I have been
>>>> dealing with this for like 5 days.
>>>> Kind regards
>>>>
>>>> root@node1-ceph:~# cephadm bootstrap --mon-ip 10.0.0.52
>>>> Verifying podman|docker is present...
>>>> Verifying lvm2 is present...
>>>> Verifying time synchronization is in place...
>>>> Unit systemd-timesyncd.service is enabled and running
>>>> Repeating the final host check...
>>>> docker (/usr/bin/docker) is present
>>>> systemctl is present
>>>> lvcreate is present
>>>> Unit systemd-timesyncd.service is enabled and running
>>>> Host looks OK
>>>> Cluster fsid: 4ce3a92a-8ddd-11ee-9b23-6341187f70c1
>>>> Verifying IP 10.0.0.52 port 3300 ...
>>>> Verifying IP 10.0.0.52 port 6789 ...
>>>> Mon IP `10.0.0.52` is in CIDR network `10.0.0.0/24`
>>>> Mon IP `10.0.0.52` is in CIDR network `10.0.0.1/32`
>>>> Internal network (--cluster-network) has not been provided, OSD
>>>> replication will default to the public_network
>>>> Pulling container image quay.io/ceph/ceph:v17...
>>>> Ceph version: ceph version 17.2.7
>>>> (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
>>>> Extracting ceph user uid/gid from container image...
>>>> Creating initial keys...
>>>> Creating initial monmap...
>>>> Creating mon...
>>>> Waiting for mon to start...
>>>> Waiting for mon...
>>>> mon is available
>>>> Assimilating anything we can from ceph.conf...
>>>> Generating new minimal ceph.conf...
>>>> Restarting the monitor...
>>>> Setting mon public_network to 10.0.0.1/32,10.0.0.0/24
>>>> Wrote config to /etc/ceph/ceph.conf
>>>> Wrote keyring to /etc/ceph/ceph.client.admin.keyring
>>>> Creating mgr...
>>>> Verifying port 9283 ...
>>>> Waiting for mgr to start...
>>>> Waiting for mgr...
>>>> mgr not available, waiting (1/15)...
>>>> mgr not available, waiting (2/15)...
>>>> mgr not available, waiting (3/15)...
>>>> mgr not available, waiting (4/15)...
>>>> mgr not available, waiting (5/15)...
>>>> mgr is available
>>>> Enabling cephadm module...
>>>> Waiting for the mgr to restart...
>>>> Waiting for mgr epoch 5...
>>>> mgr epoch 5 is available
>>>> Setting orchestrator backend to cephadm...
>>>> Generating ssh key...
>>>> Wrote public SSH key to /etc/ceph/ceph.pub
>>>> Adding key to root@localhost authorized_keys...
>>>> Adding host node1-ceph...
>>>> Deploying mon service with default placement...
>>>> Deploying mgr service with default placement...
>>>> Deploying crash service with default placement...
>>>> Deploying ceph-exporter service with default placement...
>>>> Non-zero exit code 22 from /usr/bin/docker run --rm --ipc=host
>>>> --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph --init -e
>>>> CONTAINER_IMAGE=quay.io/ceph/ceph:v17 -e NODE_NAME=node1-ceph -e
>>>> CEPH_USE_RANDOM_NONCE=1 -v
>>>> /var/log/ceph/4ce3a92a-8ddd-11ee-9b23-6341187f70c1:/var/log/ceph:z -v
>>>> /tmp/ceph-tmp6yz3vt5s:/etc/ceph/ceph.client.admin.keyring:z -v
>>>> /tmp/ceph-tmpfhd01qwu:/etc/ceph/ceph.conf:z quay.io/ceph/ceph:v17 orch
>>>> apply ceph-exporter
>>>> /usr/bin/ceph: stderr Error EINVAL: Usage:
>>>> /usr/bin/ceph: stderr   ceph orch apply -i <yaml spec> [--dry-run]
>>>> /usr/bin/ceph: stderr   ceph orch apply <service_type> [--placement=<placement_string>] [--unmanaged]
>>>> /usr/bin/ceph: stderr
>>>> Traceback (most recent call last):
>>>>   File "/usr/sbin/cephadm", line 9653, in <module>
>>>>     main()
>>>>   File "/usr/sbin/cephadm", line 9641, in main
>>>>     r = ctx.func(ctx)
>>>>   File "/usr/sbin/cephadm", line 2205, in _default_image
>>>>     return func(ctx)
>>>>   File "/usr/sbin/cephadm", line 5774, in command_bootstrap
>>>>     prepare_ssh(ctx, cli, wait_for_mgr_restart)
>>>>   File "/usr/sbin/cephadm", line 5275, in prepare_ssh
>>>>     cli(['orch', 'apply', t])
>>>>   File "/usr/sbin/cephadm", line 5708, in cli
>>>>     return CephContainer(
>>>>   File "/usr/sbin/cephadm", line 4144, in run
>>>>     out, _, _ = call_throws(self.ctx, self.run_cmd(),
>>>>   File "/usr/sbin/cephadm", line 1853, in call_throws
>>>>     raise RuntimeError('Failed command: %s' % ' '.join(command))
>>>> RuntimeError: Failed command: /usr/bin/docker run --rm --ipc=host
>>>> --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph --init -e
>>>> CONTAINER_IMAGE=quay.io/ceph/ceph:v17 -e NODE_NAME=node1-ceph -e
>>>> CEPH_USE_RANDOM_NONCE=1 -v
>>>> /var/log/ceph/4ce3a92a-8ddd-11ee-9b23-6341187f70c1:/var/log/ceph:z -v
>>>> /tmp/ceph-tmp6yz3vt5s:/etc/ceph/ceph.client.admin.keyring:z -v
>>>> /tmp/ceph-tmpfhd01qwu:/etc/ceph/ceph.conf:z quay.io/ceph/ceph:v17 orch
>>>> apply ceph-exporter
>>>>
>>>> --
>>>> Francisco Arencibia Quesada.
>>>> DevOps Engineer
>>>> _______________________________________________
>>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>>>
>>
>> --
>> Francisco Arencibia Quesada.
>> DevOps Engineer
>>
>

--
Francisco Arencibia Quesada.
DevOps Engineer
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx