Re: error deploying ceph

To run a `ceph orch...` command (or really any command against the cluster) you
should first open a shell with `cephadm shell`. That will put you in a bash
shell inside a container that has the ceph packages matching the ceph version
in your cluster. If you just want a single command rather than an interactive
shell, you can also do `cephadm shell -- ceph orch...`. Also, this might
not turn out to be an issue, but just thinking ahead: the devices cephadm
will typically let you put an OSD on are the ones reported by
`ceph orch device ls` (which is populated by `cephadm ceph-volume --
inventory --format=json-pretty` if you want to look further). So I'd
generally say to always check that before creating any OSDs through the
orchestrator. I'd also generally recommend setting up OSDs through
drive group specs (
https://docs.ceph.com/en/latest/cephadm/services/osd/#advanced-osd-service-specifications)
rather than `ceph orch daemon add osd...`, although that's a tangent from what
you're trying to do now.
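
In case it's useful, here is a rough sketch of what that workflow can look
like (the `all-flash` service_id and the host pattern in the spec below are
just made-up examples, so adjust them to match your environment):

  # open an interactive shell in a container matching the cluster version,
  # or prefix individual commands with `cephadm shell --`
  cephadm shell
  ceph orch device ls

  # osd-spec.yaml -- an example drive group spec; rotational: 0 selects
  # non-spinning (SSD/NVMe) devices on the matched hosts
  service_type: osd
  service_id: all-flash
  placement:
    host_pattern: 'node*-ceph'
  spec:
    data_devices:
      rotational: 0

  # preview what OSDs the spec would create, then apply it for real
  ceph orch apply -i osd-spec.yaml --dry-run
  ceph orch apply -i osd-spec.yaml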

On Wed, Nov 29, 2023 at 4:14 PM Francisco Arencibia Quesada <
arencibia.francisco@xxxxxxxxx> wrote:

> Thanks so much, Adam, that worked great; however, I cannot add any storage
> with:
>
> sudo cephadm ceph orch daemon add osd node2-ceph:/dev/nvme1n1
>
> root@node1-ceph:~# ceph status
>   cluster:
>     id:     9d8f1112-8ef9-11ee-838e-a74e679f7866
>     health: HEALTH_WARN
>             Failed to apply 1 service(s): osd.all-available-devices
>             2 failed cephadm daemon(s)
>             OSD count 0 < osd_pool_default_size 3
>
>   services:
>     mon: 1 daemons, quorum node1-ceph (age 18m)
>     mgr: node1-ceph.jitjfd(active, since 17m)
>     osd: 0 osds: 0 up, 0 in (since 6m)
>
>   data:
>     pools:   0 pools, 0 pgs
>     objects: 0 objects, 0 B
>     usage:   0 B used, 0 B / 0 B avail
>     pgs:
>
> root@node1-ceph:~#
>
> Regards
>
>
>
> On Wed, Nov 29, 2023 at 5:45 PM Adam King <adking@xxxxxxxxxx> wrote:
>
>> I think I remember a bug that happened when there was a small mismatch
>> between the cephadm version being used for bootstrapping and the container.
>> In this case, the cephadm binary used for bootstrap knows about the
>> ceph-exporter service and the container image being used does not. The
>> ceph-exporter was removed from quincy between 17.2.6 and 17.2.7, so I'd
>> guess the cephadm binary here is a bit older and it's pulling the 17.2.7
>> image. For now, I'd say just work around this by running bootstrap with the
>> `--skip-monitoring-stack` flag. If you want the other services in the
>> monitoring stack after bootstrap, you can just run `ceph orch apply
>> <service>` for the alertmanager, prometheus, node-exporter, and grafana
>> services, and that would get you to the same spot as if you hadn't provided
>> the flag and weren't hitting the issue.
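>>
>> As a rough sketch (just illustrative; the `ceph orch apply` commands are
>> run from a `cephadm shell` once bootstrap has finished):
>>
>>   cephadm bootstrap --mon-ip 10.0.0.52 --skip-monitoring-stack
>>   ceph orch apply prometheus
>>   ceph orch apply alertmanager
>>   ceph orch apply node-exporter
>>   ceph orch apply grafana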
>>
>> For an extra note, this failed bootstrap might be leaving things around
>> that could cause subsequent bootstraps to fail. If you run `cephadm ls` and
>> see things listed, you can grab the fsid from the output of that command
>> and run `cephadm rm-cluster --force --fsid <fsid>` to clean up the env
>> before bootstrapping again.
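>>
>> For example (the fsid here is the one from your bootstrap log below; use
>> whatever `cephadm ls` reports on your host):
>>
>>   cephadm ls                    # note the "fsid" field in the output
>>   cephadm rm-cluster --force --fsid 4ce3a92a-8ddd-11ee-9b23-6341187f70c1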
>>
>> On Wed, Nov 29, 2023 at 11:32 AM Francisco Arencibia Quesada <
>> arencibia.francisco@xxxxxxxxx> wrote:
>>
>>> Hello guys,
>>>
>>> This situation is driving me crazy. I have tried to deploy a ceph cluster
>>> in every way possible, even with ansible, and at some point it breaks. I'm
>>> using Ubuntu 22.04. This is one of the errors I'm having, some problem
>>> with ceph-exporter. Could you please help me? I have been dealing with
>>> this for about 5 days.
>>> Kind regards
>>>
>>>  root@node1-ceph:~# cephadm bootstrap --mon-ip 10.0.0.52
>>> Verifying podman|docker is present...
>>> Verifying lvm2 is present...
>>> Verifying time synchronization is in place...
>>> Unit systemd-timesyncd.service is enabled and running
>>> Repeating the final host check...
>>> docker (/usr/bin/docker) is present
>>> systemctl is present
>>> lvcreate is present
>>> Unit systemd-timesyncd.service is enabled and running
>>> Host looks OK
>>> Cluster fsid: 4ce3a92a-8ddd-11ee-9b23-6341187f70c1
>>> Verifying IP 10.0.0.52 port 3300 ...
>>> Verifying IP 10.0.0.52 port 6789 ...
>>> Mon IP `10.0.0.52` is in CIDR network `10.0.0.0/24`
>>> Mon IP `10.0.0.52` is in CIDR network `10.0.0.0/24`
>>> Mon IP `10.0.0.52` is in CIDR network `10.0.0.1/32`
>>> Mon IP `10.0.0.52` is in CIDR network `10.0.0.1/32`
>>> Internal network (--cluster-network) has not been provided, OSD
>>> replication
>>> will default to the public_network
>>> Pulling container image quay.io/ceph/ceph:v17...
>>> Ceph version: ceph version 17.2.7
>>> (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
>>> Extracting ceph user uid/gid from container image...
>>> Creating initial keys...
>>> Creating initial monmap...
>>> Creating mon...
>>> Waiting for mon to start...
>>> Waiting for mon...
>>> mon is available
>>> Assimilating anything we can from ceph.conf...
>>> Generating new minimal ceph.conf...
>>> Restarting the monitor...
>>> Setting mon public_network to 10.0.0.1/32,10.0.0.0/24
>>> Wrote config to /etc/ceph/ceph.conf
>>> Wrote keyring to /etc/ceph/ceph.client.admin.keyring
>>> Creating mgr...
>>> Verifying port 9283 ...
>>> Waiting for mgr to start...
>>> Waiting for mgr...
>>> mgr not available, waiting (1/15)...
>>> mgr not available, waiting (2/15)...
>>> mgr not available, waiting (3/15)...
>>> mgr not available, waiting (4/15)...
>>> mgr not available, waiting (5/15)...
>>> mgr is available
>>> Enabling cephadm module...
>>> Waiting for the mgr to restart...
>>> Waiting for mgr epoch 5...
>>> mgr epoch 5 is available
>>> Setting orchestrator backend to cephadm...
>>> Generating ssh key...
>>> Wrote public SSH key to /etc/ceph/ceph.pub
>>> Adding key to root@localhost authorized_keys...
>>> Adding host node1-ceph...
>>> Deploying mon service with default placement...
>>> Deploying mgr service with default placement...
>>> Deploying crash service with default placement...
>>> Deploying ceph-exporter service with default placement...
>>> Non-zero exit code 22 from /usr/bin/docker run --rm --ipc=host
>>> --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph --init -e
>>> CONTAINER_IMAGE=quay.io/ceph/ceph:v17 -e NODE_NAME=node1-ceph -e
>>> CEPH_USE_RANDOM_NONCE=1 -v
>>> /var/log/ceph/4ce3a92a-8ddd-11ee-9b23-6341187f70c1:/var/log/ceph:z -v
>>> /tmp/ceph-tmp6yz3vt5s:/etc/ceph/ceph.client.admin.keyring:z -v
>>> /tmp/ceph-tmpfhd01qwu:/etc/ceph/ceph.conf:z quay.io/ceph/ceph:v17 orch
>>> apply ceph-exporter
>>> /usr/bin/ceph: stderr Error EINVAL: Usage:
>>> /usr/bin/ceph: stderr   ceph orch apply -i <yaml spec> [--dry-run]
>>> /usr/bin/ceph: stderr   ceph orch apply <service_type>
>>> [--placement=<placement_string>] [--unmanaged]
>>> /usr/bin/ceph: stderr
>>> Traceback (most recent call last):
>>>   File "/usr/sbin/cephadm", line 9653, in <module>
>>>     main()
>>>   File "/usr/sbin/cephadm", line 9641, in main
>>>     r = ctx.func(ctx)
>>>   File "/usr/sbin/cephadm", line 2205, in _default_image
>>>     return func(ctx)
>>>   File "/usr/sbin/cephadm", line 5774, in command_bootstrap
>>>     prepare_ssh(ctx, cli, wait_for_mgr_restart)
>>>   File "/usr/sbin/cephadm", line 5275, in prepare_ssh
>>>     cli(['orch', 'apply', t])
>>>   File "/usr/sbin/cephadm", line 5708, in cli
>>>     return CephContainer(
>>>   File "/usr/sbin/cephadm", line 4144, in run
>>>     out, _, _ = call_throws(self.ctx, self.run_cmd(),
>>>   File "/usr/sbin/cephadm", line 1853, in call_throws
>>>     raise RuntimeError('Failed command: %s' % ' '.join(command))
>>> RuntimeError: Failed command: /usr/bin/docker run --rm --ipc=host
>>> --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph --init -e
>>> CONTAINER_IMAGE=quay.io/ceph/ceph:v17 -e NODE_NAME=node1-ceph -e
>>> CEPH_USE_RANDOM_NONCE=1 -v
>>> /var/log/ceph/4ce3a92a-8ddd-11ee-9b23-6341187f70c1:/var/log/ceph:z -v
>>> /tmp/ceph-tmp6yz3vt5s:/etc/ceph/ceph.client.admin.keyring:z -v
>>> /tmp/ceph-tmpfhd01qwu:/etc/ceph/ceph.conf:z quay.io/ceph/ceph:v17 orch
>>> apply ceph-exporter
>>>
>>> --
>>> *Francisco Arencibia Quesada.*
>>> *DevOps Engineer*
>>> _______________________________________________
>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>>
>>>
>
> --
> *Francisco Arencibia Quesada.*
> *DevOps Engineer*
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



