Re: ceph orchestrator pulls strange images from docker.io

Hi Stefan,

the cluster is running 17.2.6 across the board. The mentioned containers
with other versions don't show up in ceph -s or ceph versions.
It looks like it is host related.
One host gets the correct 17.2.6 images, one gets the 16.2.11 images, and the
third one uses the 17.0.0-7183-g54142666 (whatever that is) images.
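
If it helps, this is roughly how I'm checking which image each mgr actually
resolves to per host (guessing a bit at the exact field names in the yaml
output):

root@0cc47a6df330:~# ceph orch ps --daemon-type mgr --format yaml | grep -iE 'hostname|image'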

root@0cc47a6df330:~# ceph config-key get config/global/container_image
Error ENOENT:

root@0cc47a6df330:~# ceph config-key list |grep container_image
    "config-history/12/+mgr.0cc47a6df14e/container_image",
    "config-history/13/+mgr.0cc47aad8ce8/container_image",
    "config/mgr.0cc47a6df14e/container_image",
    "config/mgr.0cc47aad8ce8/container_image",

I've tried to set the default image with ceph config-key set
config/global/container_image
quay.io/ceph/ceph:v17.2.6@sha256:6b0a24e3146d4723700ce6579d40e6016b2c63d9bf90422653f2d4caa49be232
but I cannot redeploy the mgr daemons, because there is no standby daemon.

root@0cc47a6df330:~# ceph orch redeploy mgr
Error EINVAL: Unable to schedule redeploy for mgr.0cc47aad8ce8: No standby MGR
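
Maybe I can work around the standby check by pinning the image and
redeploying a single daemon by name; if I read the orchestrator CLI right,
something along these lines should do it (daemon name taken from the ceph
orch ps output below):

root@0cc47a6df330:~# ceph config set global container_image quay.io/ceph/ceph:v17.2.6@sha256:6b0a24e3146d4723700ce6579d40e6016b2c63d9bf90422653f2d4caa49be232
root@0cc47a6df330:~# ceph orch daemon redeploy mgr.0cc47a6df14e.iltiot quay.io/ceph/ceph:v17.2.6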

But there should be:
root@0cc47a6df330:~# ceph orch ps
NAME                     HOST          PORTS   STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
mgr.0cc47a6df14e.iltiot  0cc47a6df14e  *:9283  running (23s)  22s ago    2m   10.6M    -        16.2.11  de4b0b384ad4  0f31a162fa3e
mgr.0cc47aad8ce8         0cc47aad8ce8          running (16h)  8m ago     16h  591M     -        17.2.6   22cd8daf4d70  8145c63fdc44

root@0cc47a6df330:~# ceph orch ls
NAME  PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
mgr              2/2  8m ago     19h  0cc47a6df14e;0cc47a6df330;0cc47aad8ce8

I've also removed podman and containerd, killed all directories, and then did
a fresh reinstall of podman, which also did not work.
It's also strange that the daemons with the wonky versions got an extra
suffix in their names.
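
One more thing I want to check on the affected hosts is which image cephadm
actually baked into the per-daemon unit files (assuming the usual
/var/lib/ceph/<fsid>/<daemon>/unit.run layout):

root@0cc47a6df330:~# grep -hoE '(docker|quay)\.io/[^ ]+' /var/lib/ceph/03977a23-f00f-4bb0-b9a7-de57f40ba853/mgr.*/unit.run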

If I knew how, I would happily nuke the whole orchestrator, podman and
everything that goes along with it, and start over. In the end it is not
that hard to start some mgr/mon daemons without podman, so I would be back
to a classical cluster.
I tried this yesterday, but the daemons still use those very strange images
and I just don't understand why.
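
From skimming the cephadm docs, I think the per-host teardown would go
roughly like this (disable the orchestrator module first, then let cephadm
remove its daemons on each host; please correct me if that is the wrong way,
I have not run it yet):

root@0cc47a6df330:~# ceph mgr module disable cephadm
root@0cc47a6df330:~# cephadm rm-cluster --fsid 03977a23-f00f-4bb0-b9a7-de57f40ba853 --force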

I could just nuke the whole dev cluster, wipe all disks and start fresh
after reinstalling the hosts, but as I have to adopt 17 clusters to the
orchestrator, I'd rather learn something from this broken state :)

Am Fr., 15. Sept. 2023 um 08:26 Uhr schrieb Stefan Kooman <stefan@xxxxxx>:

> On 14-09-2023 17:49, Boris Behrens wrote:
> > Hi,
> > I currently try to adopt our stage cluster, some hosts just pull strange
> > images.
> >
> > root@0cc47a6df330:/var/lib/containers/storage/overlay-images# podman ps
> > CONTAINER ID  IMAGE                                           COMMAND               CREATED        STATUS            PORTS  NAMES
> > a532c37ebe42  docker.io/ceph/daemon-base:latest-master-devel  -n mgr.0cc47a6df3...  2 minutes ago  Up 2 minutes ago         ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853-mgr-0cc47a6df330-fxrfyl
> >
> > root@0cc47a6df330:~# ceph orch ps
> > NAME                     HOST                             PORTS   STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION                IMAGE ID      CONTAINER ID
> > mgr.0cc47a6df14e.vqizdz  0cc47a6df14e.f00f.gridscale.dev  *:9283  running (3m)   3m ago     3m   10.8M    -        16.2.11                de4b0b384ad4  00b02cd82a1c
> > mgr.0cc47a6df330.iijety  0cc47a6df330.f00f.gridscale.dev  *:9283  running (5s)   2s ago     4s   10.5M    -        17.0.0-7183-g54142666  75e3d7089cea  662c6baa097e
> > mgr.0cc47aad8ce8         0cc47aad8ce8.f00f.gridscale.dev          running (65m)  8m ago     60m  553M     -        17.2.6                 22cd8daf4d70  8145c63fdc44
> >
> > Any idea what I need to do to change that?
>
> I want to get some things cleared up. What is the version you are
> running? I see three different ceph versions active now. I see you are
> running a podman ps command, but see docker images pulled. AFAIK podman
> needs a different IMAGE than docker ... or do you have a mixed setup?
>
> What does "ceph config-key get config/global/container_image" give you?
>
> ceph config-key list |grep container_image should give you a list
> (including config-history) where you can see what has been configured
> before.
>
> cephadm logs might give a clue as well.
>
> You can configure the IMAGE version / type that you want by setting the
> key and redeploy affected containers: For example (18.1.2):
>
> ceph config-key set config/global/container_image
>
> quay.io/ceph/ceph:v18.1.2@sha256:82a380c8127c42da406b7ce1281c2f3c0a86d4ba04b1f4b5f8d1036b8c24784f
>
> Gr. Stefan
>


-- 
This time, as an exception, the "UTF-8 problems" self-help group will meet in
the large hall.



