Hello all,

I'm trying to troubleshoot a test cluster that, when adding a new host, attempts to deploy an old quay.io/ceph/ceph@sha256:<hash> image that no longer exists. The cluster is running 16.2.6 and was deployed last week with:

    cephadm bootstrap --mon-ip $(facter -p ipaddress) --allow-fqdn-hostname --ssh-user cephadm
    # Within "cephadm shell"
    ceph orch host add <hostname> <IP> _admin
    <repeated for 14 more hosts>

This initial cluster worked fine and the mon/mgr/osd/crash/etc. containers were all running the following image:

    quay.io/ceph/ceph@sha256:31ad0a2bd8182c948cace326251ce1561804d7de948f370c8c44d29a175cc67c

This week, we tried deploying 3 additional hosts using the same "ceph orch host add" commands. cephadm seems to be attempting to deploy the same image, but it no longer exists on quay.io. The error shows up in the active mgr's logs as:

    Non-zero exit code 125 from /bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint stat --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:31ad0a2bd8182c948cace326251ce1561804d7de948f370c8c44d29a175cc67c -e NODE_NAME=<hostname> -e CEPH_USE_RANDOM_NONCE=1 quay.io/ceph/ceph@sha256:31ad0a2bd8182c948cace326251ce1561804d7de948f370c8c44d29a175cc67c -c %u %g /var/lib/ceph
    stat: stderr Trying to pull quay.io/ceph/ceph@sha256:31ad0a2bd8182c948cace326251ce1561804d7de948f370c8c44d29a175cc67c...
    stat: stderr Error: Error initializing source docker://quay.io/ceph/ceph@sha256:31ad0a2bd8182c948cace326251ce1561804d7de948f370c8c44d29a175cc67c: Error reading manifest sha256:31ad0a2bd8182c948cace326251ce1561804d7de948f370c8c44d29a175cc67c in quay.io/ceph/ceph: manifest unknown: manifest unknown

I suspect this is because of the container_image global config option:

    [ceph: root@<hostname> /]# ceph config-key get config/global/container_image
    quay.io/ceph/ceph@sha256:31ad0a2bd8182c948cace326251ce1561804d7de948f370c8c44d29a175cc67c

My questions are:

* Is it expected for the cluster to reference a (potentially nonexistent) image by sha256 hash rather than by a tag such as :v16 or :v16.2.6?
* What's the best way to get back into a state where new hosts can be added again? Is it sufficient to just update the container_image global config?

Thank you!
Andrew Gunnerson

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
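
For illustration, here is a minimal sketch of the check described above and of the candidate change the second question asks about. The helper name image_ref_kind is hypothetical, and the commented-out "ceph config set" line is an untested assumption about what a fix might look like, not a confirmed answer; whether it is sufficient is exactly the open question.

```shell
#!/bin/sh
# Hypothetical helper: classify an image reference as pinned by digest,
# by tag, or neither. Only the last path component is inspected, so a
# registry port such as localhost:5000 is not mistaken for a tag.
image_ref_kind() {
    name="${1##*/}"
    case "$name" in
        *@sha256:*) echo digest ;;
        *:*)        echo tag ;;
        *)          echo untagged ;;
    esac
}

# Only query the cluster when the ceph CLI is available
# (for example inside "cephadm shell").
if command -v ceph >/dev/null 2>&1; then
    current=$(ceph config-key get config/global/container_image)
    echo "container_image reference kind: $(image_ref_kind "$current")"
    echo "container_image value: $current"
    # Candidate change (assumption, untested): point the global option
    # at a tag instead of the missing digest.
    # ceph config set global container_image quay.io/ceph/ceph:v16.2.6
fi
```

Run against the value shown in the post, the helper reports "digest", which matches the suspicion that the cluster stored the image by sha256 hash rather than by tag.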