Oh, we found the issue. A very old update was stuck in the pipeline. We canceled it, and then the correct images got pulled.

Now on to the next issue: daemons that start have problems talking to the cluster.

# podman logs 72248bafb0d3
2023-09-15T10:47:30.740+0000 7f2943559700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [1] but i only support [1]
2023-09-15T10:47:30.740+0000 7f294ac601c0 -1 mgr init Authentication failed, did you specify a mgr ID with a valid keyring?
Error in initialization: (13) Permission denied

When we add the following lines to the mgr config and restart the daemon, it works flawlessly:

auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx

Did I miss some config value that needs to be set?

Trying the same with a new mon will not work:

2023-09-15T10:59:28.960+0000 7fc851a77700 -1 mon.0cc47a6df330@-1(probing) e0 handle_auth_bad_method hmm, they didn't like 2 result (95) Operation not supported
2023-09-15T10:59:32.164+0000 7fc851a77700 -1 mon.0cc47a6df330@-1(probing) e0 handle_auth_bad_method hmm, they didn't like 2 result (95) Operation not supported
2023-09-15T10:59:38.568+0000 7fc851a77700 -1 mon.0cc47a6df330@-1(probing) e0 handle_auth_bad_method hmm, they didn't like 2 result (95) Operation not supported

I added the mon via:

ceph orch daemon add mon FQDN:[IPv6_address]

(Rough sketches of the auth checks I am running and of the image cleanup I plan to try are appended at the very end of this mail, below the quoted messages.)

On Fri, 15 Sep 2023 at 09:21, Boris Behrens <bb@xxxxxxxxx> wrote:
> Hi Stefan,
>
> the cluster is running 17.2.6 across the board. The mentioned containers
> with other versions don't show up in ceph -s or ceph versions.
> It looks like it is host-related.
> One host gets the correct 17.2.6 images, one gets the 16.2.11 images, and
> the third one uses the 17.0.0-7183-g54142666 images (whatever that is).
>
> root@0cc47a6df330:~# ceph config-key get config/global/container_image
> Error ENOENT:
>
> root@0cc47a6df330:~# ceph config-key list |grep container_image
> "config-history/12/+mgr.0cc47a6df14e/container_image",
> "config-history/13/+mgr.0cc47aad8ce8/container_image",
> "config/mgr.0cc47a6df14e/container_image",
> "config/mgr.0cc47aad8ce8/container_image",
>
> I've tried to set the default image via:
> ceph config-key set config/global/container_image quay.io/ceph/ceph:v17.2.6@sha256:6b0a24e3146d4723700ce6579d40e6016b2c63d9bf90422653f2d4caa49be232
> But I cannot redeploy the mgr daemons, because there is no standby daemon.
>
> root@0cc47a6df330:~# ceph orch redeploy mgr
> Error EINVAL: Unable to schedule redeploy for mgr.0cc47aad8ce8: No standby MGR
>
> But there should be one:
> root@0cc47a6df330:~# ceph orch ps
> NAME                     HOST          PORTS   STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
> mgr.0cc47a6df14e.iltiot  0cc47a6df14e  *:9283  running (23s)  22s ago    2m   10.6M    -        16.2.11  de4b0b384ad4  0f31a162fa3e
> mgr.0cc47aad8ce8         0cc47aad8ce8          running (16h)  8m ago     16h  591M     -        17.2.6   22cd8daf4d70  8145c63fdc44
>
> root@0cc47a6df330:~# ceph orch ls
> NAME  PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
> mgr          2/2      8m ago     19h  0cc47a6df14e;0cc47a6df330;0cc47aad8ce8
>
> I've also removed podman and containerd, killed all the directories, and
> then done a fresh reinstall of podman, which also did not work.
> It's also strange that the daemons with the wonky versions got an extra
> suffix.
>
> If I knew how, I would happily nuke the whole orchestrator, podman, and
> everything that goes along with it, and start over. In the end it is not
> that hard to start some mgr/mon daemons without podman, so I would be
> back to a classical cluster.
> I tried this yesterday, but the daemons still use those very strange
> images and I just don't understand why.
>
> I could just nuke the whole dev cluster, wipe all disks and start fresh
> after reinstalling the hosts, but as I have to adopt 17 clusters to the
> orchestrator, I'd rather get some learnings from the thing that isn't
> working :)
>
> On Fri, 15 Sep 2023 at 08:26, Stefan Kooman <stefan@xxxxxx> wrote:
>
>> On 14-09-2023 17:49, Boris Behrens wrote:
>> > Hi,
>> > I'm currently trying to adopt our stage cluster, and some hosts just
>> > pull strange images.
>> >
>> > root@0cc47a6df330:/var/lib/containers/storage/overlay-images# podman ps
>> > CONTAINER ID  IMAGE                                            COMMAND               CREATED        STATUS            PORTS  NAMES
>> > a532c37ebe42  docker.io/ceph/daemon-base:latest-master-devel   -n mgr.0cc47a6df3...  2 minutes ago  Up 2 minutes ago         ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853-mgr-0cc47a6df330-fxrfyl
>> >
>> > root@0cc47a6df330:~# ceph orch ps
>> > NAME                     HOST                             PORTS   STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION                IMAGE ID      CONTAINER ID
>> > mgr.0cc47a6df14e.vqizdz  0cc47a6df14e.f00f.gridscale.dev  *:9283  running (3m)   3m ago     3m   10.8M    -        16.2.11                de4b0b384ad4  00b02cd82a1c
>> > mgr.0cc47a6df330.iijety  0cc47a6df330.f00f.gridscale.dev  *:9283  running (5s)   2s ago     4s   10.5M    -        17.0.0-7183-g54142666  75e3d7089cea  662c6baa097e
>> > mgr.0cc47aad8ce8         0cc47aad8ce8.f00f.gridscale.dev          running (65m)  8m ago     60m  553M     -        17.2.6                 22cd8daf4d70  8145c63fdc44
>> >
>> > Any idea what I need to do to change that?
>>
>> I want to get some things cleared up first. What version are you
>> running? I see three different Ceph versions active now. I also see you
>> are running a podman ps command, but docker images being pulled. AFAIK
>> podman needs a different IMAGE than docker ... or do you have a mixed
>> setup?
>>
>> What does "ceph config-key get config/global/container_image" give you?
>>
>> ceph config-key list |grep container_image should give you a list
>> (including config-history) where you can see what has been configured
>> before.
>>
>> The cephadm logs might give a clue as well.
>>
>> You can configure the image version / type that you want by setting the
>> key and redeploying the affected containers. For example (18.1.2):
>>
>> ceph config-key set config/global/container_image quay.io/ceph/ceph:v18.1.2@sha256:82a380c8127c42da406b7ce1281c2f3c0a86d4ba04b1f4b5f8d1036b8c24784f
>>
>> Gr. Stefan
>
>
> --
> The self-help group "UTF-8-Probleme" will, as an exception, meet in the large hall this time.

--
The self-help group "UTF-8-Probleme" will, as an exception, meet in the large hall this time.
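
PS: here are the sketches I mentioned above, in case someone spots what I am missing. First the auth side: this is only how I am checking whether the cluster and the freshly deployed daemons disagree about cephx, not a confirmed fix. The mgr name and the fsid are taken from the podman ps output further up and are just examples; they will differ for the daemon that actually fails.

# what the cluster itself expects
ceph config get mon auth_cluster_required
ceph config get mon auth_service_required
ceph config get mon auth_client_required

# the key the cluster has stored for the daemon
ceph auth get mgr.0cc47a6df330.fxrfyl

# the keyring and config the container is actually started with
cat /var/lib/ceph/03977a23-f00f-4bb0-b9a7-de57f40ba853/mgr.0cc47a6df330.fxrfyl/keyring
cat /var/lib/ceph/03977a23-f00f-4bb0-b9a7-de57f40ba853/mgr.0cc47a6df330.fxrfyl/config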
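
The other sketch is the image side: what I plan to try to get rid of the stale per-daemon container_image overrides and to redeploy one mgr at a time instead of the whole service. The daemon and section names are again taken from the outputs above, and I am not sure whether "ceph orch daemon redeploy" avoids the "No standby MGR" check, so treat this as a sketch rather than a known-good procedure.

# show every image override that is currently set
ceph config dump | grep container_image

# drop the per-daemon overrides so the global image wins
ceph config rm mgr.0cc47a6df14e container_image
ceph config rm mgr.0cc47aad8ce8 container_image

# set the default image for everything
ceph config set global container_image quay.io/ceph/ceph:v17.2.6

# redeploy the standby mgr with an explicit image, fail over,
# then redeploy the rest
ceph orch daemon redeploy mgr.0cc47a6df14e.iltiot quay.io/ceph/ceph:v17.2.6
ceph mgr fail
ceph orch redeploy mgr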