Re: [EXTERNAL] Re: Cephadm Offline Bootstrapping Issue

The setting can technically be applied at bootstrap time. Bootstrap supports
a `--config` flag that takes the path to a ceph config file. That file
could include

[mgr]
mgr/cephadm/use_repo_digest = false

The config file is assimilated into the cluster as part of bootstrap and
use_repo_digest will be false once bootstrap completes.
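
For example, a rough sketch (the config file name here is just an
example; the remaining flags should match whatever you normally pass to
bootstrap):

        $ cat initial-ceph.conf
        [mgr]
        mgr/cephadm/use_repo_digest = false

        $ cephadm --image ceph/ceph:v18.2.2 --docker bootstrap \
              --config ./initial-ceph.conf \
              --mon-ip <mon-ip> --skip-pull ...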

On Mon, Aug 5, 2024 at 3:42 AM Alex Hussein-Kershaw (HE/HIM) <
alexhus@xxxxxxxxxxxxx> wrote:

> Thanks! The initial email was from a month ago and I think it only made
> its way onto the mailing list recently, as I was having trouble getting
> signed up. Unfortunately that means I don't have a system in this state
> anymore.
>
> I have since got through the bootstrapping issues by running a docker
> registry on localhost following guidance here: Deploying a new Ceph
> cluster — Ceph Documentation
> <https://docs.ceph.com/en/latest/cephadm/install/#deployment-in-an-isolated-environment>.
> It does seem like a bit of overhead to have to do this when the image is
> already available locally, but it's a minor annoyance compared to not
> being able to create a cluster.
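>
> Roughly, the idea is something like this (a simplified sketch; the image
> name and registry port are just examples, and I haven't shown any
> registry authentication):
>
>         docker run -d -p 5000:5000 --name registry registry:2
>         docker tag ceph/ceph:v18.2.2 localhost:5000/ceph/ceph:v18.2.2
>         docker push localhost:5000/ceph/ceph:v18.2.2
>         cephadm --image localhost:5000/ceph/ceph:v18.2.2 --docker bootstrap ...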
>
> The repo digest thing sounds promising, but it would be a shame to have
> to run bootstrapping, have it fail, apply some config, and restart a bunch
> of services to make this work. Feels like it's possibly a wider bug with
> the --skip-pull flag? Unless it's possible to set config (or pass it in
> somehow) before the point of bootstrapping?
>
> I confess I don't really understand why this field is not set by the
> docker client running locally. I wonder if I can do anything on the docker
> client side to add a repo digest. I'll explore that a bit.
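>
> Presumably something like
>
>         docker image inspect --format '{{json .RepoDigests}}' ceph/ceph:v18.2.2
>
> would show whether a digest exists at all; my understanding is that an
> image loaded from a tarball has an empty list here, since digests are
> only assigned when an image is pushed to or pulled from a registry.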
>
> Thanks,
> Alex
>
> ------------------------------
> *From:* Adam King <adking@xxxxxxxxxx>
> *Sent:* Friday, August 2, 2024 4:21 PM
> *To:* Alex Hussein-Kershaw (HE/HIM) <alexhus@xxxxxxxxxxxxx>
> *Cc:* ceph-users@xxxxxxx <ceph-users@xxxxxxx>
> *Subject:* [EXTERNAL] Re:  Cephadm Offline Bootstrapping Issue
>
> The thing that stands out to me from that output is that the image has no
> repo_digests. It's possible cephadm is expecting there to be digests and is
> crashing out trying to grab them for this image. I think it's worth a try
> to set mgr/cephadm/use_repo_digest to false, and then restart the mgr. FWIW
> turning off that setting has resolved other issues related to disconnected
> installs as well. It just means you should avoid using floating tags.
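>
> Something along these lines should do it (run from a cephadm shell, or
> via docker exec into the mon container as in your output below):
>
>         ceph config set mgr mgr/cephadm/use_repo_digest false
>         ceph mgr fail
>
> where `ceph mgr fail` with no arguments simply fails over/restarts the
> active mgr.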
>
> On Thu, Aug 1, 2024 at 11:19 PM Alex Hussein-Kershaw (HE/HIM) <
> alexhus@xxxxxxxxxxxxx> wrote:
>
> Hi,
>
> I'm hitting an issue doing an offline install of Ceph 18.2.2 using cephadm.
>
> Long output below... any advice is appreciated.
>
> Looks like we didn't manage to add the admin label (trying with
> --skip-admin also results in a similar health warning).
>
> Subsequently trying to add an OSD fails quietly, I assume because cephadm
> is unhappy.
>
> Thanks,
> Alex
>
> $  sudo  cephadm --image "ceph/ceph:v18.2.2" --docker bootstrap  --mon-ip
> `hostname -I` --skip-pull --ssh-user qs-admin --ssh-private-key
> /home/qs-admin/.ssh/id_rsa --ssh-public-key /home/qs-admin/.ssh/id_rsa.pub
> --skip-dashboard
> Verifying ssh connectivity using standard pubkey authentication ...
> Adding key to qs-admin@localhost authorized_keys...
> key already in qs-admin@localhost authorized_keys...
> Verifying podman|docker is present...
> Verifying lvm2 is present...
> Verifying time synchronization is in place...
> Unit chronyd.service is enabled and running
> Repeating the final host check...
> docker (/usr/bin/docker) is present
> systemctl is present
> lvcreate is present
> Unit chronyd.service is enabled and running
> Host looks OK
> Cluster fsid: 65bee110-3ae6-11ef-a1de-005056013d88
> Verifying IP 10.235.22.8 port 3300 ...
> Verifying IP 10.235.22.8 port 6789 ...
> Mon IP `10.235.22.8` is in CIDR network `10.235.16.0/20`
> Mon IP `10.235.22.8` is in CIDR network `10.235.16.0/20`
> Internal network (--cluster-network) has not been provided, OSD
> replication will default to the public_network
> Ceph version: ceph version 18.2.2
> (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)
> Extracting ceph user uid/gid from container image...
> Creating initial keys...
> Creating initial monmap...
> Creating mon...
> Waiting for mon to start...
> Waiting for mon...
> mon is available
> Assimilating anything we can from ceph.conf...
> Generating new minimal ceph.conf...
> Restarting the monitor...
> Setting public_network to 10.235.16.0/20 in mon config section
> Wrote config to /etc/ceph/ceph.conf
> Wrote keyring to /etc/ceph/ceph.client.admin.keyring
> Creating mgr...
> Verifying port 0.0.0.0:9283 ...
> Verifying port 0.0.0.0:8765 ...
> Verifying port 0.0.0.0:8443 ...
> Waiting for mgr to start...
> Waiting for mgr...
> mgr not available, waiting (1/15)...
> mgr not available, waiting (2/15)...
> mgr not available, waiting (3/15)...
> mgr not available, waiting (4/15)...
> mgr not available, waiting (5/15)...
> mgr is available
> Enabling cephadm module...
> Waiting for the mgr to restart...
> Waiting for mgr epoch 5...
> mgr epoch 5 is available
> Setting orchestrator backend to cephadm...
> Using provided ssh keys...
> Adding key to qs-admin@localhost authorized_keys...
> key already in qs-admin@localhost authorized_keys...
> Adding host starlight-1...
> Deploying mon service with default placement...
> Deploying mgr service with default placement...
> Deploying crash service with default placement...
> Deploying ceph-exporter service with default placement...
> Deploying prometheus service with default placement...
> Deploying grafana service with default placement...
> Deploying node-exporter service with default placement...
> Deploying alertmanager service with default placement...
> Enabling client.admin keyring and conf on hosts with "admin" label
> Non-zero exit code 5 from /usr/bin/docker run --rm --ipc=host
> --stop-signal=SIGTERM --ulimit nofile=1048576 --net=host --entrypoint
> /usr/bin/ceph --init -e CONTAINER_IMAGE=ceph/ceph:v18.2.2 -e
> NODE_NAME=starlight-1 -e CEPH_USE_RANDOM_NONCE=1 -v
> /var/log/ceph/65bee110-3ae6-11ef-a1de-005056013d88:/var/log/ceph:z -v
> /tmp/ceph-tmpxbngx708:/etc/ceph/ceph.client.admin.keyring:z -v
> /tmp/ceph-tmp94g7iyn2:/etc/ceph/ceph.conf:z ceph/ceph:v18.2.2 orch
> client-keyring set client.admin label:_admin
> /usr/bin/ceph: stderr Error EIO: Module 'cephadm' has experienced an error
> and cannot handle commands:
> ContainerInspectInfo(image_id='3c937764e6f5de1131b469dc69f0db09f8bd55cf6c983482cde518596d3dd0e5',
> ceph_version='ceph version 18.2.2
> (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)',
> repo_digests=[''])
> Unable to set up "admin" label; assuming older version of Ceph
> Saving cluster configuration to
> /var/lib/ceph/65bee110-3ae6-11ef-a1de-005056013d88/config directory
> Enabling autotune for osd_memory_target
> You can access the Ceph CLI as following in case of multi-cluster or
> non-default config:
>
>         sudo /usr/sbin/cephadm shell --fsid
> 65bee110-3ae6-11ef-a1de-005056013d88 -c /etc/ceph/ceph.conf -k
> /etc/ceph/ceph.client.admin.keyring
>
> Or, if you are only running a single cluster on this host:
>
>         sudo /usr/sbin/cephadm shell
>
> Please consider enabling telemetry to help improve Ceph:
>
>         ceph telemetry on
>
> For more information see:
>
>         https://docs.ceph.com/en/latest/mgr/telemetry/
>
> Bootstrap complete.
>
>
> $ sudo docker exec
> ceph-1b19e642-3ae5-11ef-b4e4-005056013d88-mon-starlight-1 ceph -s
>   cluster:
>     id:     1b19e642-3ae5-11ef-b4e4-005056013d88
>     health: HEALTH_ERR
>             Module 'cephadm' has failed:
> ContainerInspectInfo(image_id='3c937764e6f5de1131b469dc69f0db09f8bd55cf6c983482cde518596d3dd0e5',
> ceph_version='ceph version 18.2.2
> (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)',
> repo_digests=[''])
>             OSD count 0 < osd_pool_default_size 3
>
>   services:
>     mon: 1 daemons, quorum starlight-1 (age 2m)
>     mgr: starlight-1.yhqrry(active, since 107s)
>     osd: 0 osds: 0 up, 0 in
>
>   data:
>     pools:   0 pools, 0 pgs
>     objects: 0 objects, 0 B
>     usage:   0 B used, 0 B / 0 B avail
>     pgs:
>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



