Dear Ceph users,

during an upgrade from 18.2.2 to 18.2.4 the image pull from Dockerhub failed on one machine running a monitor daemon, while it succeeded on the previous ones.

# ceph orch upgrade status
{
    "target_image": "snack14/ceph-wizard@sha256:b1994328eb078778abdba0a17a7cf7b371e7d95ee1d543bd987892941ceb91c1",
    "in_progress": true,
    "which": "Upgrading all daemon types on all hosts",
    "services_complete": [
        "mgr"
    ],
    "progress": "3/152 daemons upgraded",
    "message": "Error: UPGRADE_FAILED_PULL: Upgrade: failed to pull target image",
    "is_paused": true
}

# ceph health detail
[WRN] UPGRADE_FAILED_PULL: Upgrade: failed to pull target image
    failed to pull snack14/ceph-wizard@sha256:b1994328eb078778abdba0a17a7cf7b371e7d95ee1d543bd987892941ceb91c1 on host aka

On the affected host I can see that the image has been pulled correctly; I can even delete it and re-pull it manually:

$ sudo docker pull snack14/ceph-wizard@sha256:b1994328eb078778abdba0a17a7cf7b371e7d95ee1d543bd987892941ceb91c1
docker.io/snack14/ceph-wizard@sha256:b1994328eb078778abdba0a17a7cf7b371e7d95ee1d543bd987892941ceb91c1: Pulling from snack14/ceph-wizard
c4df4d1fcd03: Pull complete
676ca14fffd6: Pull complete
cde0e2cfc7c9: Pull complete
Digest: sha256:b1994328eb078778abdba0a17a7cf7b371e7d95ee1d543bd987892941ceb91c1
Status: Downloaded newer image for snack14/ceph-wizard@sha256:b1994328eb078778abdba0a17a7cf7b371e7d95ee1d543bd987892941ceb91c1
docker.io/snack14/ceph-wizard@sha256:b1994328eb078778abdba0a17a7cf7b371e7d95ee1d543bd987892941ceb91c1

so I believe I didn't hit the Dockerhub pull limits. Yet the upgrade fails again in the same way when I resume it. I noticed that the docker logs show these warnings at the time of a failed pull triggered by the upgrade:

Aug 05 18:24:28 aka dockerd[1564]: time="2024-08-05T18:24:28.087629058+02:00" level=warning msg="canonical references cannot be resolved: snack14/ceph-wizard@sha256:b1994328eb078778abdba0a17a7cf7b371e7d95ee1d543bd987892941ceb91c1"
Aug 05 18:24:29 aka dockerd[1564]: time="2024-08-05T18:24:29.332143728+02:00" level=warning msg="reference for unknown type: " digest="sha256:b1994328eb078778abdba0a17a7cf7b371e7d95ee1d543bd987892941ceb91c1" remote="docker.io/snack14/ceph-wizard@sha256:b1994328eb078778>

but only the second one appears when doing a successful manual pull, so I suspect only the first one is representative of the problem.
The docker version on the affected machine is 26.1.3.

Thanks in advance for any help,

Nicola
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx