Thanks! The initial email was from a month ago and I think it only made its way onto the mailing list recently, as I was having trouble getting signed up. Unfortunately that means I don't have a system in this state anymore.

I have since got through the bootstrapping issues by running a docker registry on localhost, following the guidance in "Deploying a new Ceph cluster" in the Ceph documentation: https://docs.ceph.com/en/latest/cephadm/install/#deployment-in-an-isolated-environment. It does seem like a bit of overhead to have to do this when the image is already available locally, but it is a minor annoyance compared to not being able to create a cluster.

The repo digest thing sounds promising, but it would be a shame to have to run bootstrapping, have it fail, apply some config, and restart a bunch of services to make this work. It feels like it's possibly a wider bug with the --skip-pull flag? Unless it's possible to set config (or pass it in somehow) before the point of bootstrapping?

I confess I don't really understand why this field is not set by the docker client running locally. I wonder if I can do anything on the docker client side to add a repo digest. I'll explore that a bit.

Thanks,
Alex

________________________________
From: Adam King <adking@xxxxxxxxxx>
Sent: Friday, August 2, 2024 4:21 PM
To: Alex Hussein-Kershaw (HE/HIM) <alexhus@xxxxxxxxxxxxx>
Cc: ceph-users@xxxxxxx <ceph-users@xxxxxxx>
Subject: [EXTERNAL] Re: Cephadm Offline Bootstrapping Issue

The thing that stands out to me from that output is that the image has no repo_digests. It's possible cephadm is expecting there to be digests and is crashing out trying to grab them for this image. I think it's worth a try to set mgr/cephadm/use_repo_digest to false, and then restart the mgr. FWIW, turning off that setting has resolved other issues related to disconnected installs as well. It just means you should avoid using floating tags.
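For reference, the suggestion above (set mgr/cephadm/use_repo_digest to false, then restart the mgr) could be applied roughly as follows. This is a minimal sketch rather than a tested procedure: the `run` wrapper and the `DRY_RUN` variable are hypothetical helpers added here so that the commands are only printed by default, and using `ceph mgr fail` to restart the mgr is an assumption, since the thread only says to restart it.

```shell
#!/bin/sh
# Hypothetical helper: print each command instead of running it,
# unless DRY_RUN=0 is set explicitly.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Turn off repo digest handling in the cephadm mgr module, then restart
# the active mgr so the module picks up the new setting.
disable_repo_digest() {
    run ceph config set mgr mgr/cephadm/use_repo_digest false
    # Assumption: failing the active mgr is an acceptable way to restart it.
    run ceph mgr fail
}

disable_repo_digest
```

With the default dry run, this only prints the two commands. On a cephadm host they would presumably be run from inside `sudo cephadm shell`, with `DRY_RUN=0` set once the printed commands look right.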

On Thu, Aug 1, 2024 at 11:19 PM Alex Hussein-Kershaw (HE/HIM) <alexhus@xxxxxxxxxxxxx> wrote:

Hi,

I'm hitting an issue doing an offline install of Ceph 18.2.2 using cephadm. Long output below... any advice is appreciated.

It looks like we don't manage to add admin labels (but trying with --skip-admin also results in a similar health warning). Subsequently trying to add an OSD fails quietly, I assume because cephadm is unhappy.

Thanks,
Alex

$ sudo cephadm --image "ceph/ceph:v18.2.2" --docker bootstrap --mon-ip `hostname -I` --skip-pull --ssh-user qs-admin --ssh-private-key /home/qs-admin/.ssh/id_rsa --ssh-public-key /home/qs-admin/.ssh/id_rsa.pub --skip-dashboard
Verifying ssh connectivity using standard pubkey authentication ...
Adding key to qs-admin@localhost authorized_keys...
key already in qs-admin@localhost authorized_keys...
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
docker (/usr/bin/docker) is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Host looks OK
Cluster fsid: 65bee110-3ae6-11ef-a1de-005056013d88
Verifying IP 10.235.22.8 port 3300 ...
Verifying IP 10.235.22.8 port 6789 ...
Mon IP `10.235.22.8` is in CIDR network `10.235.16.0/20`
Mon IP `10.235.22.8` is in CIDR network `10.235.16.0/20`
Internal network (--cluster-network) has not been provided, OSD replication will default to the public_network
Ceph version: ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Waiting for mon to start...
Waiting for mon...
mon is available
Assimilating anything we can from ceph.conf...
Generating new minimal ceph.conf...
Restarting the monitor...
Setting public_network to 10.235.16.0/20 in mon config section
Wrote config to /etc/ceph/ceph.conf
Wrote keyring to /etc/ceph/ceph.client.admin.keyring
Creating mgr...
Verifying port 0.0.0.0:9283 ...
Verifying port 0.0.0.0:8765 ...
Verifying port 0.0.0.0:8443 ...
Waiting for mgr to start...
Waiting for mgr...
mgr not available, waiting (1/15)...
mgr not available, waiting (2/15)...
mgr not available, waiting (3/15)...
mgr not available, waiting (4/15)...
mgr not available, waiting (5/15)...
mgr is available
Enabling cephadm module...
Waiting for the mgr to restart...
Waiting for mgr epoch 5...
mgr epoch 5 is available
Setting orchestrator backend to cephadm...
Using provided ssh keys...
Adding key to qs-admin@localhost authorized_keys...
key already in qs-admin@localhost authorized_keys...
Adding host starlight-1...
Deploying mon service with default placement...
Deploying mgr service with default placement...
Deploying crash service with default placement...
Deploying ceph-exporter service with default placement...
Deploying prometheus service with default placement...
Deploying grafana service with default placement...
Deploying node-exporter service with default placement...
Deploying alertmanager service with default placement...
Enabling client.admin keyring and conf on hosts with "admin" label
Non-zero exit code 5 from /usr/bin/docker run --rm --ipc=host --stop-signal=SIGTERM --ulimit nofile=1048576 --net=host --entrypoint /usr/bin/ceph --init -e CONTAINER_IMAGE=ceph/ceph:v18.2.2 -e NODE_NAME=starlight-1 -e CEPH_USE_RANDOM_NONCE=1 -v /var/log/ceph/65bee110-3ae6-11ef-a1de-005056013d88:/var/log/ceph:z -v /tmp/ceph-tmpxbngx708:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmp94g7iyn2:/etc/ceph/ceph.conf:z ceph/ceph:v18.2.2 orch client-keyring set client.admin label:_admin
/usr/bin/ceph: stderr Error EIO: Module 'cephadm' has experienced an error and cannot handle commands: ContainerInspectInfo(image_id='3c937764e6f5de1131b469dc69f0db09f8bd55cf6c983482cde518596d3dd0e5', ceph_version='ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)', repo_digests=[''])
Unable to set up "admin" label; assuming older version of Ceph
Saving cluster configuration to /var/lib/ceph/65bee110-3ae6-11ef-a1de-005056013d88/config directory
Enabling autotune for osd_memory_target
You can access the Ceph CLI as following in case of multi-cluster or non-default config:

        sudo /usr/sbin/cephadm shell --fsid 65bee110-3ae6-11ef-a1de-005056013d88 -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring

Or, if you are only running a single cluster on this host:

        sudo /usr/sbin/cephadm shell

Please consider enabling telemetry to help improve Ceph:

        ceph telemetry on

For more information see:

        https://docs.ceph.com/en/latest/mgr/telemetry/

Bootstrap complete.
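
The `repo_digests=['']` in the error above is the field under discussion. One way to check whether a locally available image carries a repo digest might look like the sketch below. It is illustrative only: `has_repo_digest` is a hypothetical helper, and matching on `@sha256:` is a heuristic assumption about what a digest entry looks like in the output of `docker image inspect`.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the given RepoDigests JSON, as
# printed by `docker image inspect --format '{{json .RepoDigests}}' IMAGE`,
# appears to contain at least one digest entry; fails (exit 1) otherwise.
# Heuristic assumption: a digest entry contains "@sha256:".
has_repo_digest() {
    case "$1" in
        *@sha256:*) return 0 ;;
        *) return 1 ;;
    esac
}

image="ceph/ceph:v18.2.2"

# Only query docker when the client is installed; a failed inspect is
# treated the same as "no digest found".
if command -v docker >/dev/null 2>&1; then
    digests=$(docker image inspect --format '{{json .RepoDigests}}' "$image" 2>/dev/null) || digests=""
    if has_repo_digest "$digests"; then
        echo "repo digests for $image: $digests"
    else
        echo "no repo digest found for $image"
    fi
fi
```

The helper takes the inspect output as an argument so that the check itself can be exercised without a docker daemon.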

$ sudo docker exec ceph-1b19e642-3ae5-11ef-b4e4-005056013d88-mon-starlight-1 ceph -s
  cluster:
    id:     1b19e642-3ae5-11ef-b4e4-005056013d88
    health: HEALTH_ERR
            Module 'cephadm' has failed: ContainerInspectInfo(image_id='3c937764e6f5de1131b469dc69f0db09f8bd55cf6c983482cde518596d3dd0e5', ceph_version='ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)', repo_digests=[''])
            OSD count 0 < osd_pool_default_size 3

  services:
    mon: 1 daemons, quorum starlight-1 (age 2m)
    mgr: starlight-1.yhqrry(active, since 107s)
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx