Re: Ubuntu 24.02 LTS Ceph status warning

I pulled the v19 image and cleaned up the unused images. The problem remains.

On host hvs004 I ran the commands 'cephadm gather-facts' and 'cephadm ceph-volume lvm list' directly, and both gave sensible output without errors.
Could it be a Python issue with the encoding of the JSON input? But then why does this happen in a Docker container on one host and not on another...

Can anybody give me some hints on how to resolve this issue?
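
To make the error itself more concrete: below is a minimal, hypothetical Python sketch of how a 'ValueError: too many values to unpack (expected 2)' can arise, assuming host_facts.py splits each line of the AppArmor profiles listing into exactly two fields (the sample data and the split logic are mine, not the actual cephadm code):

    # Hypothetical illustration, not the actual cephadm code: a two-field
    # unpack fails as soon as one line yields more than two fields.
    profiles_listing = (
        "/usr/sbin/cups-browsed enforce\n"
        "a profile name with spaces (unconfined)\n"  # hypothetical problem line
    )

    for line in profiles_listing.splitlines():
        try:
            item, mode = line.split(' ')  # expects exactly two fields per line
        except ValueError as e:
            print(f"failed on {line!r}: {e}")  # too many values to unpack (expected 2)
        else:
            print(f"profile={item!r} mode={mode!r}")

If hvs004 exposes an AppArmor profile entry that contains extra spaces (and the other hosts don't), that would at least explain why only this host trips over it, but treat that purely as a guess on my part.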

> -----Original Message-----
> From: Dominique Ramaekers <dominique.ramaekers@xxxxxxxxxx>
> Sent: Tuesday, 15 October 2024 9:28
> To: Eugen Block <eblock@xxxxxx>; ceph-users@xxxxxxx
> Subject:  Re: Ubuntu 24.02 LTS Ceph status warning
>
> 'ceph config get mgr container_image' gives
> quay.io/ceph/ceph@sha256:200087c35811bf28e8a8073b15fa86c07cce85c575f1ccd62d1d6ddbfdc6770a => OK
>
> 'ceph health detail' gives
> HEALTH_WARN failed to probe daemons or devices
> [WRN] CEPHADM_REFRESH_FAILED: failed to probe daemons or devices
>     host hvs004 `cephadm gather-facts` failed: cephadm exited with an error code: 1, stderr: Traceback (most recent call last):
>   File "<frozen runpy>", line 198, in _run_module_as_main
>   File "<frozen runpy>", line 88, in _run_code
>   File "/var/lib/ceph/dd4b0610-b4d2-11ec-bb58-d1b32ae31585/cephadm.a58127a8eed242cae13849ddbebcb9931d7a5410f406f2d264e3b1ed31d9605e/__main__.py", line 5579, in <module>
>   ...
>   File "/var/lib/ceph/dd4b0610-b4d2-11ec-bb58-d1b32ae31585/cephadm.a58127a8eed242cae13849ddbebcb9931d7a5410f406f2d264e3b1ed31d9605e/cephadmlib/host_facts.py", line 722, in _fetch_apparmor
> ValueError: too many values to unpack (expected 2)
>     host hvs004 `cephadm ceph-volume` failed: cephadm exited with an error code: 1, stderr: Inferring config /var/lib/ceph/dd4b0610-b4d2-11ec-bb58-d1b32ae31585/config/ceph.conf
> Traceback (most recent call last):
>   File "<frozen runpy>", line 198, in _run_module_as_main
>   File "<frozen runpy>", line 88, in _run_code
>   File "/var/lib/ceph/dd4b0610-b4d2-11ec-bb58-d1b32ae31585/cephadm.a58127a8eed242cae13849ddbebcb9931d7a5410f406f2d264e3b1ed31d9605e/__main__.py", line 5579, in <module>
>   ...
>   File "/var/lib/ceph/dd4b0610-b4d2-11ec-bb58-d1b32ae31585/cephadm.a58127a8eed242cae13849ddbebcb9931d7a5410f406f2d264e3b1ed31d9605e/cephadmlib/host_facts.py", line 722, in _fetch_apparmor
> ValueError: too many values to unpack (expected 2)
>
> I do think it's a ceph version issue, so I started to compare the hvs004 host
> with a well-behaving host, hvs001. I found this:
> 'root@hvs001~#:cephadm shell ceph -v' gives ceph version 19.2.0 (16063ff2022298c9300e49a547a16ffda59baf13) squid (stable)
> 'root@hvs004~#:cephadm shell ceph -v' gives ceph version 19.3.0-5346-gcc481a63 (cc481a63bc03a534cb8e2e961293d6509ba59401) squid (dev)
>
> It seems only the shell uses the wrong Docker image, so I listed the images on both hosts:
> hvs001
> ----------
> REPOSITORY                         TAG       IMAGE ID       CREATED         SIZE
> quay.io/ceph/ceph                  v19       37996728e013   2 weeks ago     1.28GB
> quay.io/ceph/ceph                  v18.2     2bc0b0f4375d   2 months ago    1.22GB
> quay.io/ceph/ceph                  v18       a27483cc3ea0   6 months ago    1.26GB
> quay.io/ceph/ceph                  v17       5a04c8b3735d   9 months ago    1.27GB
> quay.io/ceph/ceph-grafana          9.4.7     954c08fa6188   10 months ago   633MB
> quay.io/ceph/grafana               9.4.12    2bacad6d85d8   17 months ago   330MB
> quay.io/prometheus/prometheus      v2.43.0   a07b618ecd1d   19 months ago   234MB
> quay.io/prometheus/alertmanager    v0.25.0   c8568f914cd2   22 months ago   65.1MB
> quay.io/prometheus/node-exporter   v1.5.0    0da6a335fe13   22 months ago   22.5MB
> quay.io/ceph/ceph                  v17.2     0912465dcea5   2 years ago     1.34GB
> quay.io/ceph/ceph                  v17.2.3   0912465dcea5   2 years ago     1.34GB
> quay.io/ceph/ceph-grafana          8.3.5     dad864ee21e9   2 years ago     558MB
> quay.ceph.io/ceph-ci/ceph          master    c5ce177c6a5d   2 years ago     1.38GB
> quay.io/prometheus/prometheus      v2.33.4   514e6a882f6e   2 years ago     204MB
> quay.io/prometheus/node-exporter   v1.3.1    1dbe0e931976   2 years ago     20.9MB
> quay.io/prometheus/alertmanager    v0.23.0   ba2b418f427c   3 years ago     57.5MB
> quay.io/ceph/ceph-grafana          6.7.4     557c83e11646   3 years ago     486MB
> quay.io/prometheus/prometheus      v2.18.1   de242295e225   4 years ago     140MB
> quay.io/prometheus/alertmanager    v0.20.0   0881eb8f169f   4 years ago     52.1MB
> quay.io/prometheus/node-exporter   v0.18.1   e5a616e4b9cf   5 years ago     22.9MB
>
> hvs004
> ---------
> REPOSITORY                         TAG       IMAGE ID       CREATED         SIZE
> quay.ceph.io/ceph-ci/ceph          main      6e76ca06f33a   11 days ago     1.41GB
> quay.io/ceph/ceph                  <none>    37996728e013   2 weeks ago     1.28GB
> quay.io/ceph/ceph                  v17       9cea3956c04b   18 months ago   1.16GB
> quay.io/prometheus/node-exporter   v1.5.0    0da6a335fe13   22 months ago   22.5MB
> quay.io/prometheus/node-exporter   v1.3.1    1dbe0e931976   2 years ago     20.9MB
>
> On hvs004 I pulled the v19-tagged image, and 'cephadm shell ceph -v' then gave the correct version.
>
> It seems my Docker images aren't automatically managed by Ceph?
>
> Can I fix this, or do I have to pull the correct images and remove the wrong ones myself?
>
>
> > -----Original Message-----
> > From: Eugen Block <eblock@xxxxxx>
> > Sent: Friday, 11 October 2024 13:03
> > To: ceph-users@xxxxxxx
> > Subject:  Re: Ubuntu 24.02 LTS Ceph status warning
> >
> > I don't think the warning is related to a specific ceph version. The
> > orchestrator uses the default image anyway; you can get it via:
> >
> > ceph config get mgr container_image
> >
> > 'ceph health detail' should reveal which host or daemon misbehaves. I
> > would then look into cephadm.log on that host for more hints about what
> > exactly goes wrong. You should also look into the active MGR log; it
> > could give you further hints as to why that service fails.
> >
> > Zitat von Dominique Ramaekers <dominique.ramaekers@xxxxxxxxxx>:
> >
> > > I manage a 4-host cluster on Ubuntu 22.04 LTS with Ceph installed
> > > through cephadm and running in Docker containers.
> > >
> > > Last month I migrated to the latest Ceph 19.2. All went great.
> > >
> > > Last week I upgraded one of my hosts to Ubuntu 24.04.1 LTS. Now I
> > > get the following warning in 'cephadm shell -- ceph status':
> > > Failed to apply 1 service(s): osd.all-available-devices failed to
> > > probe daemons or devices
> > >
> > > Outside the ceph shell:
> > > 'ceph -v' results in 'ceph version 19.2.0~git20240301.4c76c50 (4c76c50a73f63ba48ccdf0adccce03b00d1d80c7) squid (dev)'
> > >
> > > Inside the shell: 'ceph version 19.3.0-5346-gcc481a63 (cc481a63bc03a534cb8e2e961293d6509ba59401) squid (dev)'
> > > All OSDs, MONs, MGRs and MDSs are on 19.2.0 (image id 37996728e013)
> > >
> > > Do I get the warning because the Ubuntu package of ceph is still on
> > > a development version?
> > > Or could there be another underlying problem?
> > >
> > > Thanks for the help.

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


