Pull failed on cluster upgrade

Dear Ceph users,

During an upgrade from 18.2.2 to 18.2.4, the image pull from Docker Hub failed on one machine running a monitor daemon, although it had succeeded on the previous hosts.

# ceph orch upgrade status
{
    "target_image": "snack14/ceph-wizard@sha256:b1994328eb078778abdba0a17a7cf7b371e7d95ee1d543bd987892941ceb91c1",
    "in_progress": true,
    "which": "Upgrading all daemon types on all hosts",
    "services_complete": [
        "mgr"
    ],
    "progress": "3/152 daemons upgraded",
    "message": "Error: UPGRADE_FAILED_PULL: Upgrade: failed to pull target image",
    "is_paused": true
}

# ceph health detail
[WRN] UPGRADE_FAILED_PULL: Upgrade: failed to pull target image
failed to pull snack14/ceph-wizard@sha256:b1994328eb078778abdba0a17a7cf7b371e7d95ee1d543bd987892941ceb91c1 on host aka


On the affected host I can see that the image has been pulled correctly. I can even delete it and pull it again manually:

$ sudo docker pull snack14/ceph-wizard@sha256:b1994328eb078778abdba0a17a7cf7b371e7d95ee1d543bd987892941ceb91c1
docker.io/snack14/ceph-wizard@sha256:b1994328eb078778abdba0a17a7cf7b371e7d95ee1d543bd987892941ceb91c1: Pulling from snack14/ceph-wizard
c4df4d1fcd03: Pull complete
676ca14fffd6: Pull complete
cde0e2cfc7c9: Pull complete
Digest: sha256:b1994328eb078778abdba0a17a7cf7b371e7d95ee1d543bd987892941ceb91c1
Status: Downloaded newer image for snack14/ceph-wizard@sha256:b1994328eb078778abdba0a17a7cf7b371e7d95ee1d543bd987892941ceb91c1
docker.io/snack14/ceph-wizard@sha256:b1994328eb078778abdba0a17a7cf7b371e7d95ee1d543bd987892941ceb91c1


So I don't believe I hit the Docker Hub pull limits. Yet the upgrade fails again in the same way when I resume it. I noticed that the Docker logs show these warnings at the time of a failed pull triggered by the upgrade:

Aug 05 18:24:28 aka dockerd[1564]: time="2024-08-05T18:24:28.087629058+02:00" level=warning msg="canonical references cannot be resolved: snack14/ceph-wizard@sha256:b1994328eb078778abdba0a17a7cf7b371e7d95ee1d543bd987892941ceb91c1"
Aug 05 18:24:29 aka dockerd[1564]: time="2024-08-05T18:24:29.332143728+02:00" level=warning msg="reference for unknown type: " digest="sha256:b1994328eb078778abdba0a17a7cf7b371e7d95ee1d543bd987892941ceb91c1" remote="docker.io/snack14/ceph-wizard@sha256:b1994328eb078778>

However, only the second one appears during a successful manual pull, so I suspect that only the first one is related to the problem.
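
The comparison between a failed pull and a manual pull can be made less error-prone with a small shell helper. The following is only a sketch: the function name docker_warning_summary is hypothetical, and the journalctl invocation in the comment assumes that Docker runs as the systemd unit "docker", which may differ on other hosts. It groups dockerd warning messages after stripping the image digest, so that the set of warnings from each pull can be compared side by side.

```shell
# Hypothetical helper: summarize dockerd warning messages read from stdin.
# It keeps only level=warning lines, extracts the msg="..." field,
# strips any "@sha256:<hex>" digest, and counts identical messages.
docker_warning_summary() {
    grep 'level=warning' \
        | sed -n 's/.*msg="\([^"]*\)".*/\1/p' \
        | sed 's/@sha256:[0-9a-f]*//' \
        | sort | uniq -c
}

# Assumed usage on the affected host (unit name "docker" is an assumption;
# the time window is taken from the log lines quoted above):
#   journalctl -u docker --since "2024-08-05 18:24" --until "2024-08-05 18:25" \
#       | docker_warning_summary
```

Running it once over the time window of a failed upgrade pull and once over that of a manual pull would show which warnings are specific to the failure.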

The docker version on the affected machine is 26.1.3.

Thanks in advance for any help,

Nicola


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
