Re: cephadm picks development/latest tagged image for daemon-base (docker.io/ceph/daemon-base:latest-pacific-devel)

Adam King <adking@xxxxxxxxxx> · Fri, 4 Feb 2022 11:52:23 -0500

Hi Arun,

Not too sure about the port thing (I'll look into that when I have a
chance) but it does look like a bug with bootstrapping with the
'--no-minimize-config' flag. I opened a tracker issue for it
https://tracker.ceph.com/issues/54141.
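
For anyone trying to confirm whether they are hitting the same thing, one
rough way to spot daemons running from an unexpected image is to pull the
VERSION and IMAGE ID columns out of 'ceph orch ps'. The sketch below is only
an illustration: distinct_images and sample_orch_ps are made-up helper names,
and it assumes the column layout of the 16.2.7 output quoted further down
(IMAGE ID second from last, VERSION third from last).

```shell
# List the distinct VERSION / IMAGE ID pairs in `ceph orch ps` table output.
# Assumption: IMAGE ID and VERSION are the 2nd- and 3rd-from-last columns,
# as in the 16.2.7 output quoted later in this thread.
distinct_images() {
    awk 'NR > 1 && NF >= 3 { print $(NF-2), $(NF-1) }' | sort -u
}

# Sample rows copied from the thread (line wrapping undone), so the
# function can be tried without a running cluster.
sample_orch_ps() {
    cat <<'EOF'
NAME                  HOST       PORTS   STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION               IMAGE ID      CONTAINER ID
crash.hcictrl01       hcictrl01          running (87s)    83s ago  87s    6975k        -  16.2.5-387-g7282d81d  41387741ad94  6e87fba9235d
mgr.hcictrl01.fvopfn  hcictrl01  *:9283  running (7m)     83s ago   7m     399M        -  16.2.7                231fd40524c4  67c1e6f2ff1f
mon.hcictrl01         hcictrl01          running (8m)     83s ago   8m    45.4M    2048M  16.2.7                231fd40524c4  c7bfdf3b5831
EOF
}

sample_orch_ps | distinct_images
```

On a live cluster the equivalent would be 'ceph orch ps | distinct_images'.
More than one pair in the output means not every daemon is on the image
passed to '--image'; the sample above yields two pairs, one per image.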

Thanks for helping find this bug,

- Adam King

On Fri, Feb 4, 2022 at 3:59 AM Arun Vinod <arunvinod.tech@xxxxxxxxx> wrote:

> Hi Adam,
>
> Found the culprit here. It was the flag '--no-minimize-config' like
> you said.
> When I bootstrapped the cluster without the flag '--no-minimize-config',
> all daemons were created using the image specified through the '--image'
> flag.
>
> So, when '--no-minimize-config' was provided, the mgr somehow was not
> picking up the specified image. Our intention in supplying this flag to
> bootstrap was to have a detailed configuration of the cluster readily
> available in the ceph.conf file, which our engineers can check anytime and
> use to modify the cluster parameters. However, it was not working as
> expected: the detailed conf file was created during the bootstrap run, but
> at the end of the bootstrap it was overwritten with a minimal ceph
> configuration. That still looks like a bug, along with the flag's impact
> on the container image used by the mgr.
>
> Regarding the source of cephadm script, we are fetching it through dnf
> install of the cephadm package, after the right repo was enabled.  Steps
> followed are:
>
> rpm -Uvh https://download.ceph.com/rpm-16.2.7/el8/noarch/ceph-release-1-1.el8.noarch.rpm
> dnf install cephadm-16.2.7-0.el8
>
> We assume the cephadm python script that gets installed comes from the git
> repo https://github.com/ceph/ceph/blob/v16.2.7/src/cephadm/cephadm . I ran a
> diff between the dnf-installed cephadm and the cephadm from the above GitHub
> link and found no differences.
>
> So, in a nutshell, initiating the bootstrap with '--image' and without
> '--no-minimize-config' resolved the multiple image issue for us. One
> observation we had, though, is that the first mgr daemon was created on a
> different port compared to the others. I am pointing this out since it
> might be a bug that you would be interested in.
>
> mgr.hcictrl01.hnztid  hcictrl01  *:9283  running (4m)  26s ago  4m  411M  -  16.2.7  231fd40524c4  ceaefccd1047
> mgr.hcictrl02.xlpdzz  hcictrl02  *:8443  running (2m)  53s ago  2m  381M  -  16.2.7  231fd40524c4  430f71e8224e
> mgr.hcictrl03.pxceem  hcictrl03  *:8443  running (2m)  97s ago  2m  384M  -  16.2.7  231fd40524c4  92b38206aa57
>
> Running 'ceph orch redeploy mgr' brings the first mgr also to the same
> port configuration as rest.
>
> Thanks Adam for all your assistance in sorting out this issue, our entire
> team highly appreciates all the insights you have provided.
>
> Also, we are facing one more major issue with an offline bootstrap of the
> ceph cluster using the --skip-pull flag. I will open a new thread on it so
> that we can start a fresh discussion.
>
> Thanks and Regards,
> Arun Vinod
>
> On Fri, 4 Feb 2022 at 06:54, Adam King <adking@xxxxxxxxxx> wrote:
>
>> Hi Arun,
>>
>> A couple of questions. First, where did you pull your cephadm binary
>> from (the python file used for bootstrap)? I know we swapped everything
>> over to quay quite a while ago (
>> https://github.com/ceph/ceph/commit/b291aa47825ece9fcfe9831546e1d8355b3202e4)
>> so I want to make sure that if I try to recreate this I have the same
>> version of the binary. Secondly, I'm curious what your reason is for
>> supplying the "--no-minimize-config" flag. Were you getting some unwanted
>> behavior without it?
>>
>> I'll see if I can figure out what's going on here. Again, I've never seen
>> this before so it might be difficult for me to recreate but I'll see what I
>> can do. In the meantime, hopefully using the upgrade for a workaround is at
>> least okay for you.
>>
>> - Adam King
>>
>> On Thu, Feb 3, 2022 at 2:32 PM Arun Vinod <arunvinod.tech@xxxxxxxxx>
>> wrote:
>>
>>> Hi Adam,
>>>
>>> Thanks for the update. In that case this looks like a bug, as you
>>> mentioned.
>>>
>>> Here are the contents of the config file used for bootstrapping.
>>>
>>> [global]
>>> osd pool default size = 2
>>> osd pool default min size = 1
>>> osd pool default pg num = 8
>>> osd pool default pgp num = 8
>>> osd recovery delay start = 60
>>> osd memory target = 1610612736
>>> osd failsafe full ratio = 1.0
>>> mon pg warn max object skew = 20
>>> mon osd nearfull ratio = 0.8
>>> mon osd backfillfull ratio = 0.87
>>> mon osd full ratio = 0.95
>>> mon max pg per osd = 400
>>> debug asok = 0/0
>>> debug auth = 0/0
>>> debug buffer = 0/0
>>> debug client = 0/0
>>> debug context = 0/0
>>> debug crush = 0/0
>>> debug filer = 0/0
>>> debug filestore = 0/0
>>> debug finisher = 0/0
>>> debug heartbeatmap = 0/0
>>> debug journal = 0/0
>>> debug journaler = 0/0
>>> debug lockdep = 0/0
>>> debug mds = 0/0
>>> debug mds balancer = 0/0
>>> debug mds locker = 0/0
>>> debug mds log = 0/0
>>> debug mds log expire = 0/0
>>> debug mds migrator = 0/0
>>> debug mon = 0/0
>>> debug monc = 0/0
>>> debug ms = 0/0
>>> debug objclass = 0/0
>>> debug objectcacher = 0/0
>>> debug objecter = 0/0
>>> debug optracker = 0/0
>>> debug osd = 0/0
>>> debug paxos = 0/0
>>> debug perfcounter = 0/0
>>> debug rados = 0/0
>>> debug rbd = 0/0
>>> debug rgw = 0/0
>>> debug throttle = 0/0
>>> debug timer = 0/0
>>> debug tp = 0/0
>>> [osd]
>>> bluestore compression mode = passive
>>> [mon]
>>> mon osd allow primary affinity = true
>>> mon allow pool delete = true
>>> [client]
>>> rbd cache = true
>>> rbd cache writethrough until flush = true
>>> rbd concurrent management ops = 20
>>> admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok
>>> log file = /var/log/ceph/client.$pid.log
>>>
>>> Output of bootstrap command:
>>>
>>> [root@hcictrl01 stack_orchestrator]# sudo cephadm --image
>>> quay.io/ceph/ceph:v16.2.7 bootstrap --skip-monitoring-stack
>>> --mon-ip 10.175.41.11 --cluster-network 10.175.42.0/24
>>> --ssh-user ceph_deploy --ssh-private-key /home/ceph_deploy/.ssh/id_rsa
>>> --ssh-public-key /home/ceph_deploy/.ssh/id_rsa.pub
>>> --config /home/ceph_deploy/ceph_bootstrap/ceph.conf
>>> --initial-dashboard-password J959ABCFRFGE --dashboard-password-noupdate
>>> --no-minimize-config --skip-pull
>>>
>>> Verifying podman|docker is present...
>>> Verifying lvm2 is present...
>>> Verifying time synchronization is in place...
>>> Unit chronyd.service is enabled and running
>>> Repeating the final host check...
>>> podman (/bin/podman) version 3.3.1 is present
>>> systemctl is present
>>> lvcreate is present
>>> Unit chronyd.service is enabled and running
>>> Host looks OK
>>> Cluster fsid: dba72000-8525-11ec-b1e7-0015171590ba
>>> Verifying IP 10.175.41.11 port 3300 ...
>>> Verifying IP 10.175.41.11 port 6789 ...
>>> Mon IP `10.175.41.11` is in CIDR network `10.175.41.0/24`
>>> Ceph version: ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)
>>> Extracting ceph user uid/gid from container image...
>>> Creating initial keys...
>>> Creating initial monmap...
>>> Creating mon...
>>> Waiting for mon to start...
>>> Waiting for mon...
>>> mon is available
>>> Setting mon public_network to 10.175.41.0/24
>>> Setting cluster_network to 10.175.42.0/24
>>> Wrote config to /etc/ceph/ceph.conf
>>> Wrote keyring to /etc/ceph/ceph.client.admin.keyring
>>> Creating mgr...
>>> Verifying port 9283 ...
>>> Waiting for mgr to start...
>>> Waiting for mgr...
>>> mgr not available, waiting (1/15)...
>>> mgr not available, waiting (2/15)...
>>> mgr not available, waiting (3/15)...
>>> mgr not available, waiting (4/15)...
>>> mgr is available
>>> Enabling cephadm module...
>>> Waiting for the mgr to restart...
>>> Waiting for mgr epoch 5...
>>> mgr epoch 5 is available
>>> Setting orchestrator backend to cephadm...
>>> Using provided ssh keys...
>>> Adding host hcictrl01...
>>> Deploying mon service with default placement...
>>> Deploying mgr service with default placement...
>>> Deploying crash service with default placement...
>>> Enabling the dashboard module...
>>> Waiting for the mgr to restart...
>>> Waiting for mgr epoch 9...
>>> mgr epoch 9 is available
>>> Generating a dashboard self-signed certificate...
>>> Creating initial admin user...
>>> Fetching dashboard port number...
>>> Ceph Dashboard is now available at:
>>>
>>>              URL: https://hcictrl01.enclouden.com:8443/
>>>             User: admin
>>>         Password: J959ABCFRFGE
>>>
>>> Enabling client.admin keyring and conf on hosts with "admin" label
>>> You can access the Ceph CLI with:
>>>
>>>         sudo /sbin/cephadm shell --fsid dba72000-8525-11ec-b1e7-0015171590ba -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring
>>>
>>> Please consider enabling telemetry to help improve Ceph:
>>>
>>>         ceph telemetry on
>>>
>>> For more information see:
>>>
>>>         https://docs.ceph.com/docs/pacific/mgr/telemetry/
>>>
>>> Bootstrap complete.
>>>
>>>
>>> List of containers created after bootstrap:
>>>
>>> [root@hcictrl01 stack_orchestrator]# podman ps
>>> CONTAINER ID  IMAGE                                            COMMAND               CREATED             STATUS                 PORTS  NAMES
>>> c7bfdf3b5831  quay.io/ceph/ceph:v16.2.7                        -n mon.hcictrl01 ...  7 minutes ago       Up 7 minutes ago              ceph-dba72000-8525-11ec-b1e7-0015171590ba-mon-hcictrl01
>>> 67c1e6f2ff1f  quay.io/ceph/ceph:v16.2.7                        -n mgr.hcictrl01....  7 minutes ago       Up 7 minutes ago              ceph-dba72000-8525-11ec-b1e7-0015171590ba-mgr-hcictrl01-fvopfn
>>> 6e87fba9235d  docker.io/ceph/daemon-base:latest-pacific-devel  -n client.crash.h...  About a minute ago  Up About a minute ago         ceph-dba72000-8525-11ec-b1e7-0015171590ba-crash-hcictrl01
>>>
>>> [root@hcictrl01 stack_orchestrator]# ceph orch ps
>>> NAME                  HOST       PORTS   STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION               IMAGE ID      CONTAINER ID
>>> crash.hcictrl01       hcictrl01          running (87s)    83s ago  87s    6975k        -  16.2.5-387-g7282d81d  41387741ad94  6e87fba9235d
>>> mgr.hcictrl01.fvopfn  hcictrl01  *:9283  running (7m)     83s ago   7m     399M        -  16.2.7                231fd40524c4  67c1e6f2ff1f
>>> mon.hcictrl01         hcictrl01          running (8m)     83s ago   8m    45.4M    2048M  16.2.7                231fd40524c4  c7bfdf3b5831
>>>
>>> [root@hcictrl01 stack_orchestrator]# podman images
>>> REPOSITORY                  TAG                   IMAGE ID      CREATED       SIZE
>>> quay.io/ceph/ceph           v16.2.7               231fd40524c4  2 days ago    1.39 GB
>>> docker.io/ceph/daemon-base  latest-pacific-devel  41387741ad94  5 months ago  1.23 GB
>>>
>>> As you can see, the crash daemon is created with the image
>>> 'docker.io/ceph/daemon-base:latest-pacific-devel' and does not
>>> respect the --image flag provided. Also, we are not setting any config
>>> anywhere other than the bootstrap conf file.
>>>
>>>
>>> I have also attached the full log of cephadm, hope you can view it from
>>> email. Let me know if you need any further data.
>>>
>>> Thanks in advance
>>>
>>> Regards,
>>> Arun Vinod
>>>
>>> On Fri, 4 Feb 2022 at 00:17, Adam King <adking@xxxxxxxxxx> wrote:
>>>
>>>>> But even though I gave the --image flag to bootstrap, the daemons
>>>>> created by the mgr module are using the daemon-base image, in our case
>>>>> 'docker.io/ceph/daemon-base:latest-pacific-devel'.
>>>>> I guess this is because the mgr daemon takes into consideration the
>>>>> configuration parameter 'container_image', whose default value is
>>>>> 'docker.io/ceph/daemon-base:latest-pacific-devel'.
>>>>> Our guess is that even if we provide the --image flag to cephadm
>>>>> bootstrap, cephadm is not updating the container_image option with this
>>>>> value. Hence, all the remaining daemons are created using the
>>>>> daemon-base image.
>>>>
>>>>
>>>> This is not how it's supposed to work. If you provide "--image
>>>> <image-name>" to bootstrap, all ceph daemons deployed, including the
>>>> mon/mgr deployed during bootstrap AND the daemons deployed by the cephadm
>>>> mgr module afterwards, should be deployed with the image provided to the
>>>> "--image" parameter. You shouldn't need to set any config options or do
>>>> anything extra to get that to work. If you're providing "--image" to
>>>> bootstrap and this is not happening, there is a serious bug (not including
>>>> the fact that the bootstrap mgr/mon show the tag while others show the
>>>> digest; that's purely cosmetic). If that's the case, could you post the
>>>> full bootstrap output and the contents of the config file you're passing
>>>> to bootstrap so we can debug? I've never seen this issue anywhere else,
>>>> so I have no way to recreate it (for me, passing --image in bootstrap
>>>> causes all ceph daemons to be deployed with that image until I explicitly
>>>> specify another image through upgrade or other means).
>>>>
>>>>> Also, the non-uniform behaviour of the first mon, even though it was
>>>>> created using the same image, is quite surprising. I double checked the
>>>>> configuration of all mons and could not find a major difference between
>>>>> the first and the remaining mons. I tried to reconfig the first mon,
>>>>> which left it in the same state. However, redeploying the specific mon
>>>>> with the command 'ceph orch redeploy <name> quay.io/ceph/ceph:v16.2.7'
>>>>> caused the first mon to show the same warning as the rest, as it got
>>>>> redeployed by the mgr.
>>>>
>>>>
>>>>> Are we expecting any difference between the mon deployed by cephadm
>>>>> bootstrap and a mon deployed by the mgr, even if we're using the same
>>>>> image? The lack of the warning on the first mon is the only sign we have
>>>>> that the first mon might differ from the rest of the mons.
>>>>
>>>>
>>>> I could maybe see some difference if you add specific config options, as
>>>> the mon deployed during bootstrap is deployed with basic settings. Since we
>>>> can't infer config settings into the mon store until there is an existing
>>>> monitor, this is sort of necessary and could maybe cause some differences
>>>> between that mon and others. This should be resolved by a redeploy of the
>>>> mon. Can you tell me if you're setting any mon-related config options in
>>>> the conf you're providing to bootstrap (or if you've set any config options
>>>> elsewhere)? It may be that cephadm needs to actively redeploy the mon if
>>>> certain options are provided, and I can look into it if I know which
>>>> sorts of config options are causing the health warning. I haven't seen that
>>>> health warning in my own testing (on the bootstrap mon or those deployed by
>>>> the mgr module), so I'd need to know what's causing it in order to come
>>>> up with a good fix.
>>>>
>>>>
>>>> - Adam King
>>>>
>>>> On Thu, Feb 3, 2022 at 11:29 AM Arun Vinod <arunvinod.tech@xxxxxxxxx>
>>>> wrote:
>>>>
>>>>> Hi Adam,
>>>>>
>>>>> Thanks for reviewing the long output.
>>>>>
>>>>> Like you said, it makes total sense now, since the first mon and mgr
>>>>> are created by cephadm bootstrap and the rest of the daemons by the mgr
>>>>> module.
>>>>>
>>>>> But even though I gave the --image flag to bootstrap, the daemons
>>>>> created by the mgr module are using the daemon-base image, in our case
>>>>> 'docker.io/ceph/daemon-base:latest-pacific-devel'.
>>>>> I guess this is because the mgr daemon takes into consideration the
>>>>> configuration parameter 'container_image', whose default value is
>>>>> 'docker.io/ceph/daemon-base:latest-pacific-devel'.
>>>>>
>>>>> Our guess is that even if we provide the --image flag to cephadm
>>>>> bootstrap, cephadm is not updating the container_image option with this
>>>>> value. Hence, all the remaining daemons are created using the
>>>>> daemon-base image.
>>>>>
>>>>> Below is the value of config 'container_image' after
>>>>> bootstrapping with --image flag provided.
>>>>>
>>>>> [root@hcictrl01 stack_orchestrator]# ceph-conf -D | grep -i container_image
>>>>> container_image = docker.io/ceph/daemon-base:latest-pacific-devel
>>>>>
>>>>> However, one workaround is to provide this config in the initial
>>>>> bootstrap config file and pass it to cephadm bootstrap using the
>>>>> --config flag, which updates the image name so that all the daemons are
>>>>> created with the same image.
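
If relying on that workaround, it may be worth checking that the file handed
to '--config' really does pin the image before running bootstrap. A minimal
sketch follows; conf_container_image is a hypothetical helper name, and the
temporary file only stands in for the real bootstrap conf. It matches the
option written with either an underscore or a space, since ceph.conf accepts
both spellings.

```shell
# Print the container_image value from a ceph.conf-style file, if present.
# Hypothetical helper for illustration; prints nothing when the option is absent.
conf_container_image() {
    awk -F'=' '
        /^[ \t]*container[_ ]image[ \t]*=/ {
            value = $2
            gsub(/^[ \t]+|[ \t]+$/, "", value)   # trim surrounding whitespace
            print value
            exit
        }' "$1"
}

# Try it on a throwaway file that mimics a bootstrap conf.
demo_conf="$(mktemp)"
cat > "$demo_conf" <<'EOF'
[global]
container_image = quay.io/ceph/ceph:v16.2.7
osd pool default size = 2
EOF
conf_container_image "$demo_conf"
rm -f "$demo_conf"
```

On a running cluster, 'ceph config get mgr container_image' reports the value
held in the cluster's configuration database, which can differ from what a
local ceph.conf or 'ceph-conf -D' shows.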
>>>>>
>>>>> Also, the non-uniform behaviour of the first mon, even though it was
>>>>> created using the same image, is quite surprising. I double checked the
>>>>> configuration of all mons and could not find a major difference between
>>>>> the first and the remaining mons. I tried to reconfig the first mon,
>>>>> which left it in the same state. However, redeploying the specific mon
>>>>> with the command 'ceph orch redeploy <name> quay.io/ceph/ceph:v16.2.7'
>>>>> caused the first mon to show the same warning as the rest, as it got
>>>>> redeployed by the mgr.
>>>>>
>>>>> Are we expecting any difference between the mon deployed by cephadm
>>>>> bootstrap and a mon deployed by the mgr, even if we're using the same
>>>>> image? The lack of the warning on the first mon is the only sign we have
>>>>> that the first mon might differ from the rest of the mons.
>>>>>
>>>>> Thanks again Adam for checking this. Your insights into this will be
>>>>> highly appreciated.
>>>>>
>>>>> Thanks and Regards,
>>>>> Arun Vinod
>>>>>
>>>>>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx