Re: quay.io image no longer existing, required for node add to repair cluster

Thank you very much :)

"ceph config set global container_image <image-name>" was the solution to get the new node to deploy fully,

and with "ceph config set mgr mgr/cephadm/use_repo_digest false" it will hopefully never repeat.
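For the archives, the sequence that resolved this, as a sketch (the v16.2.6 tag below is an example, not necessarily the exact image used; substitute whichever existing image matches your cluster):

```shell
# Point cephadm at an image that still exists on quay.io
# (the tag here is an example)
ceph config set global container_image quay.io/ceph/ceph:v16.2.6

# Keep cephadm from rewriting tags into repo digests, so a deleted
# digest cannot strand a future node add
ceph config set mgr mgr/cephadm/use_repo_digest false
```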

Now let's hope the recovery goes without further trouble.

Greetings,

Kai

On 2/25/22 16:43, Adam King wrote:
For the last question, cephadm has a config option for whether or not it tries to convert image tags to repo digests ("ceph config set mgr mgr/cephadm/use_repo_digest true/false"). I'm not sure if setting it to false helps once the tag has already been converted, though.

In terms of getting the cluster in order,

In the case there are actually daemons on this replaced node: if this image doesn't exist anymore, you can deploy the individual daemons on the host via "ceph orch daemon redeploy <daemon-name> <image-name>" with whatever 16.2.6 image you want to use for now. They would still be on a slightly different image than the other daemons, but if they're the same minor version I imagine it's okay. Once they've been redeployed with a functional image, are up and running, and the health warnings go away, you can upgrade the cluster to whichever image you were redeploying those daemons with, and then they should all end up in line. I do think you would need to add the host back to the cluster first before you could redeploy the daemons in this fashion, though. Having the host back in the cluster, even if the daemons are all down, shouldn't cause issues.
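That redeploy flow, as a sketch (the hostname, IP, daemon name, and image tag are all examples; list actual daemons with "ceph orch ps"):

```shell
# Add the replaced host back first; down daemons alone shouldn't hurt
ceph orch host add node3 192.168.1.13

# Redeploy each daemon on that host with a 16.2.6 image that still
# exists (osd.7 and the tag are examples)
ceph orch daemon redeploy osd.7 quay.io/ceph/ceph:v16.2.6

# Once health is back to OK, converge the whole cluster onto the
# same image
ceph orch upgrade start --image quay.io/ceph/ceph:v16.2.6
```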

If the replaced node doesn't actually have any daemons yet, setting the global container image to an image that exists ("ceph config set global container_image <image-name>") and then adding the host should, I think, allow you to place daemons on the host as normal. Again, once things are healthy, you can use upgrade to make sure every daemon is on the same image.

- Adam King

On Fri, Feb 25, 2022 at 10:06 AM Kai Börnert <kai.boernert@xxxxxxxxx> wrote:

    Hi,

    what would be the correct way to move forward?

    I have a 3 node cephadm installed cluster, one node died, the
    other two
    are fine and work as expected, so no data loss, but a lot of
    remapped/degraded.

    The dead node was replaced and I wanted to add it to the cluster
    using
    "ceph orch host add"

    The current container_image seems to be global:
    quay.io/ceph/ceph@sha256:5d042251e1faa1408663508099cf97b256364300365d403ca5563a518060abac

    set after an update to 16.2.6 a while back. (I did not set the image
    to a digest; ceph upgrade apparently did this.)

    The new node cannot pull this image, as it no longer exists on
    quay.io.

    I tried to copy the image via docker save & docker load; however, the
    digest is not set that way, apparently for security reasons.
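    One way to carry the image bytes between hosts is skopeo; a sketch
    under assumptions (the archive path, tag, and hostname are examples,
    and it assumes a surviving node still has the image in its local
    container storage):

```shell
# On a surviving node that still has the image locally, export it
# (archive path and tag are examples)
skopeo copy \
  containers-storage:quay.io/ceph/ceph@sha256:5d042251e1faa1408663508099cf97b256364300365d403ca5563a518060abac \
  oci-archive:/tmp/ceph.tar

# Ship it to the new node and import it there
scp /tmp/ceph.tar newnode:/tmp/ceph.tar
ssh newnode skopeo copy oci-archive:/tmp/ceph.tar \
  containers-storage:quay.io/ceph/ceph:v16.2.6
```

    Note the repo digest is registry-side metadata, so the digest
    recorded in container_image still won't be reproduced locally;
    repointing container_image at an existing image remains the
    reliable fix.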

    I'd rather not do an additional ceph upgrade until the health is
    back to OK.


    Is there some other way to transfer the image to the new host?

    Is it expected that images on quay.io may disappear at any time?

    Is it possible to force ceph to use a tag instead of a digest? I
    could then fix it easily myself.


    Greetings,

    Kai


    _______________________________________________
    ceph-users mailing list -- ceph-users@xxxxxxx
    To unsubscribe send an email to ceph-users-leave@xxxxxxx

