Re: cephadm picks development/latest tagged image for daemon-base (docker.io/ceph/daemon-base:latest-pacific-devel)

Arun Vinod <arunvinod.tech@xxxxxxxxx> · Fri, 4 Feb 2022 14:28:42 +0530

Hi Adam,

Found the culprit here. It was the '--no-minimize-config' flag, as you
said.
When I bootstrapped the cluster without '--no-minimize-config', all daemons
were created using the image specified through the '--image' flag.
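
A quick way to confirm this is to group the daemons by image ID. The sketch
below is only an illustration and not part of cephadm: it assumes the
plain-text layout of 'ceph orch ps' quoted further down in this thread, where
IMAGE ID is the second-to-last column, and daemons_by_image is a name made up
for this example. The sample rows are the ones from the failing bootstrap.

```python
from collections import defaultdict

# Rows copied from the 'ceph orch ps' output quoted later in this thread,
# with each wrapped row joined back onto a single line.
SAMPLE = """\
NAME                  HOST       PORTS   STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION               IMAGE ID      CONTAINER ID
crash.hcictrl01       hcictrl01          running (87s)    83s ago  87s    6975k        -  16.2.5-387-g7282d81d  41387741ad94  6e87fba9235d
mgr.hcictrl01.fvopfn  hcictrl01  *:9283  running (7m)     83s ago   7m     399M        -  16.2.7                231fd40524c4  67c1e6f2ff1f
mon.hcictrl01         hcictrl01          running (8m)     83s ago   8m    45.4M    2048M  16.2.7                231fd40524c4  c7bfdf3b5831
"""


def daemons_by_image(table):
    """Group daemon names by IMAGE ID (assumed to be the second-to-last field)."""
    groups = defaultdict(list)
    for line in table.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:  # ignore blank or truncated rows
            continue
        groups[fields[-2]].append(fields[0])
    return dict(groups)


groups = daemons_by_image(SAMPLE)
for image_id, names in sorted(groups.items()):
    print(image_id, ", ".join(names))
if len(groups) > 1:
    print("WARNING: daemons are running from more than one image")
```

Counting fields from the end of the row avoids having to deal with the PORTS
column, which is empty for some daemons.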

So, when '--no-minimize-config' was provided, the mgr somehow did not
pick up the specified image. Our intention in supplying this flag to
bootstrap was to have a detailed configuration of the cluster readily
available in the ceph.conf file, which our engineers could check at any time
and use to modify the cluster parameters. However, it did not work as
expected: the detailed conf file was created while bootstrap was running,
but at the end of bootstrap it was overwritten with a minimal ceph
configuration. That still looks like a bug, along with the impact of the
--no-minimize-config flag on the container image used by the mgr.
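
Since the goal was to have the settings visible in one place, one option is
to read them straight from the bootstrap conf and compare them by hand with
what the cluster reports (for example with 'ceph config dump'). A minimal
sketch, assuming an INI-style file like the one quoted below; load_ceph_conf
and SAMPLE_CONF are illustrative names, and the sample holds only a few of
the options from our file. Ceph accepts option names written with spaces or
with underscores, so the names are normalized to underscores here.

```python
import configparser

# A small excerpt of the bootstrap conf quoted later in this thread.
SAMPLE_CONF = """\
[global]
osd pool default size = 2
mon max pg per osd = 400
[osd]
bluestore compression mode = passive
[mon]
mon allow pool delete = true
"""


def load_ceph_conf(text):
    """Return {section: {option_name: value}} with underscore-style names."""
    parser = configparser.ConfigParser(
        delimiters=("=",),   # only '=' separates name and value
        interpolation=None,  # leave values such as $cluster untouched
        strict=False,        # tolerate repeated sections or options
    )
    parser.optionxform = str  # do not lower-case option names
    parser.read_string(text)
    return {
        section: {
            "_".join(name.split()): value
            for name, value in parser.items(section)
        }
        for section in parser.sections()
    }


conf = load_ceph_conf(SAMPLE_CONF)
for section, options in conf.items():
    for name, value in options.items():
        print(f"{section:8} {name} = {value}")
```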

Regarding the source of the cephadm script: we get it through a dnf
install of the cephadm package, after enabling the right repo. The steps
followed are:

rpm -Uvh https://download.ceph.com/rpm-16.2.7/el8/noarch/ceph-release-1-1.el8.noarch.rpm
dnf install cephadm-16.2.7-0.el8

We assume the cephadm python script that gets installed comes from the git
repo https://github.com/ceph/ceph/blob/v16.2.7/src/cephadm/cephadm . I did a
diff between the dnf-installed cephadm and the cephadm from the above GitHub
link, and there are no differences.
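
For anyone who wants to repeat that comparison without diff, the two copies
can also be compared by checksum. This is just a sketch: sha256_of and
same_content are names invented for the example, and the file paths are
whatever you pass on the command line.

```python
import hashlib
import sys


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True when both files have byte-identical content."""
    return sha256_of(path_a) == sha256_of(path_b)


# Only runs when exactly two paths are given on the command line.
if __name__ == "__main__" and len(sys.argv) == 3:
    installed, upstream = sys.argv[1], sys.argv[2]
    print("identical" if same_content(installed, upstream) else "DIFFERENT")
```

Example call, where compare_cephadm.py and the second path are hypothetical
(the first path is the one shown in our bootstrap output):

python3 compare_cephadm.py /sbin/cephadm ./cephadm-from-github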

So, in a nutshell, running the bootstrap with '--image' and without
'--no-minimize-config' resolved the multiple-image issue for us. One
observation, though: the first mgr daemon was created on a different port
from the others. I am pointing this out since it might be a bug that you
are interested in.

mgr.hcictrl01.hnztid  hcictrl01  *:9283  running (4m)  26s ago  4m  411M  -  16.2.7  231fd40524c4  ceaefccd1047
mgr.hcictrl02.xlpdzz  hcictrl02  *:8443  running (2m)  53s ago  2m  381M  -  16.2.7  231fd40524c4  430f71e8224e
mgr.hcictrl03.pxceem  hcictrl03  *:8443  running (2m)  97s ago  2m  384M  -  16.2.7  231fd40524c4  92b38206aa57

Running 'ceph orch redeploy mgr' brings the first mgr to the same port
configuration as the rest.
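
To spot this kind of outlier without reading the table by eye, the PORTS
column can be compared across the mgr daemons. A rough sketch, assuming the
row layout shown above (PORTS is the third field when present);
ports_by_daemon and odd_ones_out are names invented for this example.

```python
from collections import Counter

# The three mgr rows shown above, one daemon per line.
ROWS = """\
mgr.hcictrl01.hnztid  hcictrl01  *:9283  running (4m)  26s ago  4m  411M  -  16.2.7  231fd40524c4  ceaefccd1047
mgr.hcictrl02.xlpdzz  hcictrl02  *:8443  running (2m)  53s ago  2m  381M  -  16.2.7  231fd40524c4  430f71e8224e
mgr.hcictrl03.pxceem  hcictrl03  *:8443  running (2m)  97s ago  2m  384M  -  16.2.7  231fd40524c4  92b38206aa57
"""


def ports_by_daemon(rows):
    """Map daemon name -> PORTS value ('' when the column is empty)."""
    result = {}
    for line in rows.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        # An empty PORTS column means the third field is already the status.
        result[fields[0]] = fields[2] if fields[2].startswith("*:") else ""
    return result


def odd_ones_out(ports):
    """Return daemons whose PORTS value differs from the most common one."""
    if not ports:
        return []
    most_common, _count = Counter(ports.values()).most_common(1)[0]
    return sorted(name for name, value in ports.items() if value != most_common)


ports = ports_by_daemon(ROWS)
print(odd_ones_out(ports))
```

In the bootstrap log quoted below, port 9283 is verified when the first mgr
is created and the dashboard URL uses 8443, which may explain the two values.
That is a guess, not something confirmed in this thread.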

Thanks Adam for all your assistance in sorting out this issue, our entire
team highly appreciates all the insights you have provided.

Also, we are facing one more major issue with an offline bootstrap of the
ceph cluster using the --skip-pull flag. I will open a new thread on it so
that we can start a fresh discussion.

Thanks and Regards,
Arun Vinod

On Fri, 4 Feb 2022 at 06:54, Adam King <adking@xxxxxxxxxx> wrote:

> Hi Arun,
>
> A couple of questions. First, where did you pull your cephadm binary
> from (the python file used for bootstrap)? I know we swapped everything
> over to quay quite a while ago (
> https://github.com/ceph/ceph/commit/b291aa47825ece9fcfe9831546e1d8355b3202e4)
> so I want to make sure that if I try to recreate this I have the same
> version of the binary. Secondly, I'm curious what your reason is for
> supplying the "--no-minimize-config" flag. Were you getting some unwanted
> behavior without it?
>
> I'll see if I can figure out what's going on here. Again, I've never seen
> this before so it might be difficult for me to recreate but I'll see what I
> can do. In the meantime, hopefully using the upgrade for a workaround is at
> least okay for you.
>
> - Adam King
>
> On Thu, Feb 3, 2022 at 2:32 PM Arun Vinod <arunvinod.tech@xxxxxxxxx>
> wrote:
>
>> Hi Adam,
>>
>> Thanks for the update. In that case this looks like a bug like you
>> mentioned.
>>
>> Here are the contents of the config file used for bootstrapping.
>>
>> [global]
>> osd pool default size = 2
>> osd pool default min size = 1
>> osd pool default pg num = 8
>> osd pool default pgp num = 8
>> osd recovery delay start = 60
>> osd memory target = 1610612736
>> osd failsafe full ratio = 1.0
>> mon pg warn max object skew = 20
>> mon osd nearfull ratio = 0.8
>> mon osd backfillfull ratio = 0.87
>> mon osd full ratio = 0.95
>> mon max pg per osd = 400
>> debug asok = 0/0
>> debug auth = 0/0
>> debug buffer = 0/0
>> debug client = 0/0
>> debug context = 0/0
>> debug crush = 0/0
>> debug filer = 0/0
>> debug filestore = 0/0
>> debug finisher = 0/0
>> debug heartbeatmap = 0/0
>> debug journal = 0/0
>> debug journaler = 0/0
>> debug lockdep = 0/0
>> debug mds = 0/0
>> debug mds balancer = 0/0
>> debug mds locker = 0/0
>> debug mds log = 0/0
>> debug mds log expire = 0/0
>> debug mds migrator = 0/0
>> debug mon = 0/0
>> debug monc = 0/0
>> debug ms = 0/0
>> debug objclass = 0/0
>> debug objectcacher = 0/0
>> debug objecter = 0/0
>> debug optracker = 0/0
>> debug osd = 0/0
>> debug paxos = 0/0
>> debug perfcounter = 0/0
>> debug rados = 0/0
>> debug rbd = 0/0
>> debug rgw = 0/0
>> debug throttle = 0/0
>> debug timer = 0/0
>> debug tp = 0/0
>> [osd]
>> bluestore compression mode = passive
>> [mon]
>> mon osd allow primary affinity = true
>> mon allow pool delete = true
>> [client]
>> rbd cache = true
>> rbd cache writethrough until flush = true
>> rbd concurrent management ops = 20
>> admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok
>> log file = /var/log/ceph/client.$pid.log
>>
>> Output of bootstrap command:
>>
>> [root@hcictrl01 stack_orchestrator]# sudo cephadm --image quay.io/ceph/ceph:v16.2.7 bootstrap \
>>     --skip-monitoring-stack --mon-ip 10.175.41.11 \
>>     --cluster-network 10.175.42.0/24 --ssh-user ceph_deploy \
>>     --ssh-private-key /home/ceph_deploy/.ssh/id_rsa \
>>     --ssh-public-key /home/ceph_deploy/.ssh/id_rsa.pub \
>>     --config /home/ceph_deploy/ceph_bootstrap/ceph.conf \
>>     --initial-dashboard-password J959ABCFRFGE --dashboard-password-noupdate \
>>     --no-minimize-config --skip-pull
>>
>> Verifying podman|docker is present...
>> Verifying lvm2 is present...
>> Verifying time synchronization is in place...
>> Unit chronyd.service is enabled and running
>> Repeating the final host check...
>> podman (/bin/podman) version 3.3.1 is present
>> systemctl is present
>> lvcreate is present
>> Unit chronyd.service is enabled and running
>> Host looks OK
>> Cluster fsid: dba72000-8525-11ec-b1e7-0015171590ba
>> Verifying IP 10.175.41.11 port 3300 ...
>> Verifying IP 10.175.41.11 port 6789 ...
>> Mon IP `10.175.41.11` is in CIDR network `10.175.41.0/24`
>> Ceph version: ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)
>> Extracting ceph user uid/gid from container image...
>> Creating initial keys...
>> Creating initial monmap...
>> Creating mon...
>> Waiting for mon to start...
>> Waiting for mon...
>> mon is available
>> Setting mon public_network to 10.175.41.0/24
>> Setting cluster_network to 10.175.42.0/24
>> Wrote config to /etc/ceph/ceph.conf
>> Wrote keyring to /etc/ceph/ceph.client.admin.keyring
>> Creating mgr...
>> Verifying port 9283 ...
>> Waiting for mgr to start...
>> Waiting for mgr...
>> mgr not available, waiting (1/15)...
>> mgr not available, waiting (2/15)...
>> mgr not available, waiting (3/15)...
>> mgr not available, waiting (4/15)...
>> mgr is available
>> Enabling cephadm module...
>> Waiting for the mgr to restart...
>> Waiting for mgr epoch 5...
>> mgr epoch 5 is available
>> Setting orchestrator backend to cephadm...
>> Using provided ssh keys...
>> Adding host hcictrl01...
>> Deploying mon service with default placement...
>> Deploying mgr service with default placement...
>> Deploying crash service with default placement...
>> Enabling the dashboard module...
>> Waiting for the mgr to restart...
>> Waiting for mgr epoch 9...
>> mgr epoch 9 is available
>> Generating a dashboard self-signed certificate...
>> Creating initial admin user...
>> Fetching dashboard port number...
>> Ceph Dashboard is now available at:
>>
>>              URL: https://hcictrl01.enclouden.com:8443/
>>             User: admin
>>         Password: J959ABCFRFGE
>>
>> Enabling client.admin keyring and conf on hosts with "admin" label
>> You can access the Ceph CLI with:
>>
>>         sudo /sbin/cephadm shell --fsid dba72000-8525-11ec-b1e7-0015171590ba -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring
>>
>> Please consider enabling telemetry to help improve Ceph:
>>
>>         ceph telemetry on
>>
>> For more information see:
>>
>>         https://docs.ceph.com/docs/pacific/mgr/telemetry/
>>
>> Bootstrap complete.
>>
>>
>> List of containers created after bootstrap:
>>
>> [root@hcictrl01 stack_orchestrator]# podman ps
>> CONTAINER ID  IMAGE                                            COMMAND               CREATED             STATUS                 PORTS  NAMES
>> c7bfdf3b5831  quay.io/ceph/ceph:v16.2.7                        -n mon.hcictrl01 ...  7 minutes ago       Up 7 minutes ago              ceph-dba72000-8525-11ec-b1e7-0015171590ba-mon-hcictrl01
>> 67c1e6f2ff1f  quay.io/ceph/ceph:v16.2.7                        -n mgr.hcictrl01....  7 minutes ago       Up 7 minutes ago              ceph-dba72000-8525-11ec-b1e7-0015171590ba-mgr-hcictrl01-fvopfn
>> 6e87fba9235d  docker.io/ceph/daemon-base:latest-pacific-devel  -n client.crash.h...  About a minute ago  Up About a minute ago         ceph-dba72000-8525-11ec-b1e7-0015171590ba-crash-hcictrl01
>>
>> [root@hcictrl01 stack_orchestrator]# ceph orch ps
>> NAME                  HOST       PORTS   STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION               IMAGE ID      CONTAINER ID
>> crash.hcictrl01       hcictrl01          running (87s)    83s ago  87s    6975k        -  16.2.5-387-g7282d81d  41387741ad94  6e87fba9235d
>> mgr.hcictrl01.fvopfn  hcictrl01  *:9283  running (7m)     83s ago   7m     399M        -  16.2.7                231fd40524c4  67c1e6f2ff1f
>> mon.hcictrl01         hcictrl01          running (8m)     83s ago   8m    45.4M    2048M  16.2.7                231fd40524c4  c7bfdf3b5831
>>
>> [root@hcictrl01 stack_orchestrator]# podman images
>> REPOSITORY                  TAG                   IMAGE ID      CREATED       SIZE
>> quay.io/ceph/ceph           v16.2.7               231fd40524c4  2 days ago    1.39 GB
>> docker.io/ceph/daemon-base  latest-pacific-devel  41387741ad94  5 months ago  1.23 GB
>>
>> As you can see, the crash daemon is created from the image
>> 'docker.io/ceph/daemon-base:latest-pacific-devel' and does not
>> respect the --image flag provided. Also, we are not setting any config
>> anywhere other than the bootstrap conf file.
>>
>>
>> I have also attached the full log of cephadm, hope you can view it from
>> email. Let me know if you need any further data.
>>
>> Thanks in advance
>>
>> Regards,
>> Arun Vinod
>>
>> On Fri, 4 Feb 2022 at 00:17, Adam King <adking@xxxxxxxxxx> wrote:
>>
>>>> But even though I gave the --image flag to bootstrap, the daemons
>>>> created by the mgr module are using the daemon-base image, in our case
>>>> 'docker.io/ceph/daemon-base:latest-pacific-devel'.
>>>> I guess this is because the mgr daemon takes into consideration the
>>>> configuration parameter 'container_image', whose default value is
>>>> 'docker.io/ceph/daemon-base:latest-pacific-devel'.
>>>> Our guess is that even if we provide the --image flag to cephadm
>>>> bootstrap, cephadm is not updating the variable container_image with
>>>> this value. Hence, all the remaining daemons are created using the
>>>> daemon-base image.
>>>
>>>
>>> This is not how it's supposed to work. If you provide "--image
>>> <image-name>" to bootstrap all ceph daemons deployed, including the mon/mgr
>>> deployed during bootstrap AND the daemons deployed by the cephadm mgr
>>> module afterwards should be deployed with the image provided to the
>>> "--image" parameter. You shouldn't need to set any config options or do
>>> anything extra to get that to work. If you're providing "--image" to
>>> bootstrap and this is not happening, there is a serious bug (not including
>>> the fact that the bootstrap mgr/mon show the tag while others show the
>>> digest, that's purely cosmetic). If that's the case, could you post the
>>> full bootstrap output and the contents of the config file you're passing to
>>> bootstrap so that we can maybe debug? I've never seen this issue before
>>> anywhere else so I have no way to recreate it (for me passing --image in
>>> bootstrap causes all ceph daemons to be deployed with that image until I
>>> explicitly specify another image through upgrade or other means).
>>>
>>>> Also, the non-uniform behaviour of the first mon, even though it was
>>>> created using the same image, is quite surprising. I double checked the
>>>> configuration of all mons, and could not find a major difference between
>>>> the first and the remaining mons. I tried to reconfig the first mon, which
>>>> left it in the same state. However, redeploying the specific mon with the
>>>> command 'ceph orch redeploy <name> quay.io/ceph/ceph:v16.2.7' caused the
>>>> first mon to show the same warning as the rest, as it got redeployed by
>>>> the mgr.
>>>
>>>
>>>> Are we expecting any difference between the mon deployed by cephadm
>>>> bootstrap and the mons deployed by the mgr, even if we're using the same
>>>> image? We have only the lack of the warning on the first mon to suggest
>>>> that there might be a difference between the first mon and the rest.
>>>
>>>
>>> I could maybe see some difference if you add specific config options as
>>> the mon deployed during bootstrap is deployed with basic settings. Since we
>>> can't infer config settings into the mon store until there is an existing
>>> monitor this is sort of necessary and could maybe cause some differences
>>> between that mon and others. This should be resolved by a redeploy of the
>>> mon. Can you tell me if you're setting any mon related config options in
>>> the conf you're providing to bootstrap (or if you've set any config options
>>> elsewhere). It may be that cephadm needs to actively redeploy the mon if
>>> certain options are provided, and I can look into it if I know which
>>> sorts of config options are causing the health warning. I haven't seen that
>>> health warning in my own testing (on the bootstrap mon or those deployed by
>>> the mgr module) so I'd need to know what's causing it to come about to come
>>> up with a good fix.
>>>
>>>
>>> - Adam King
>>>
>>> On Thu, Feb 3, 2022 at 11:29 AM Arun Vinod <arunvinod.tech@xxxxxxxxx>
>>> wrote:
>>>
>>>> Hi Adam,
>>>>
>>>> Thanks for reviewing the long output.
>>>>
>>>> Like you said, it makes total sense now, since the first mon and mgr are
>>>> created by cephadm bootstrap and the rest of the daemons by the mgr module.
>>>>
>>>> But even though I gave the --image flag to bootstrap, the daemons
>>>> created by the mgr module are using the daemon-base image, in our case
>>>> 'docker.io/ceph/daemon-base:latest-pacific-devel'.
>>>> I guess this is because the mgr daemon takes into consideration the
>>>> configuration parameter 'container_image', whose default value is
>>>> 'docker.io/ceph/daemon-base:latest-pacific-devel'.
>>>>
>>>> Our guess is that even if we provide the --image flag to cephadm
>>>> bootstrap, cephadm is not updating the variable container_image with
>>>> this value. Hence, all the remaining daemons are created using the
>>>> daemon-base image.
>>>>
>>>> Below is the value of config 'container_image' after bootstrapping with
>>>> --image flag provided.
>>>>
>>>> [root@hcictrl01 stack_orchestrator]# ceph-conf -D | grep -i container_image
>>>> container_image = docker.io/ceph/daemon-base:latest-pacific-devel
>>>>
>>>> However, one workaround is to provide this config in the initial
>>>> bootstrap config file and pass it to cephadm bootstrap using the
>>>> --config flag, which updates the image name so that all the daemons are
>>>> created with the same image.
>>>>
>>>> Also, the non-uniform behaviour of the first mon, even though it was
>>>> created using the same image, is quite surprising. I double checked the
>>>> configuration of all mons, and could not find a major difference between
>>>> the first and the remaining mons. I tried to reconfig the first mon, which
>>>> left it in the same state. However, redeploying the specific mon with the
>>>> command 'ceph orch redeploy <name> quay.io/ceph/ceph:v16.2.7' caused the
>>>> first mon to show the same warning as the rest, as it got redeployed by
>>>> the mgr.
>>>>
>>>> Are we expecting any difference between the mon deployed by cephadm
>>>> bootstrap and the mons deployed by the mgr, even if we're using the same
>>>> image? We have only the lack of the warning on the first mon to suggest
>>>> that there might be a difference between the first mon and the rest.
>>>>
>>>> Thanks again Adam for checking this. Your insights into this will be
>>>> highly appreciated.
>>>>
>>>> Thanks and Regards,
>>>> Arun Vinod
>>>>
>>>>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx