Does "ceph orch upgrade status" give any insights (e.g. an error message of some kind)? If not, maybe you could try looking at https://tracker.ceph.com/issues/56485#note-2 because it seems like a similar issue and I see you're using --ceph-version (which we need to fix, sorry about that). On Wed, Aug 31, 2022 at 10:58 PM Satish Patel <satish.txt@xxxxxxxxx> wrote: > Hi, > > I have a small cluster in the lab which has only two nodes. I have a single > monitor and two OSD nodes. > > Running upgrade but somehow it stuck after upgrading mgr > > ceph orch upgrade start --ceph-version 16.2.10 > > root@ceph1:~# ceph -s > cluster: > id: f270ad9e-1f6f-11ed-b6f8-a539d87379ea > health: HEALTH_WARN > 5 stray daemon(s) not managed by cephadm > > services: > mon: 1 daemons, quorum ceph1 (age 22m) > mgr: ceph1.xmbvsb(active, since 21m), standbys: ceph2.hmbdla > osd: 6 osds: 6 up (since 23h), 6 in (since 8d) > > data: > pools: 6 pools, 161 pgs > objects: 20.53k objects, 85 GiB > usage: 173 GiB used, 826 GiB / 1000 GiB avail > pgs: 161 active+clean > > io: > client: 0 B/s rd, 2.7 KiB/s wr, 0 op/s rd, 0 op/s wr > > progress: > Upgrade to quay.io/ceph/ceph:v16.2.10 (0s) > [............................] > > > root@ceph1:~# ceph health detail > HEALTH_WARN 5 stray daemon(s) not managed by cephadm > [WRN] CEPHADM_STRAY_DAEMON: 5 stray daemon(s) not managed by cephadm > stray daemon mgr.ceph1.xmbvsb on host ceph1 not managed by cephadm > stray daemon mon.ceph1 on host ceph1 not managed by cephadm > stray daemon osd.0 on host ceph1 not managed by cephadm > stray daemon osd.1 on host ceph1 not managed by cephadm > stray daemon osd.4 on host ceph1 not managed by cephadm > > > root@ceph1:~# ceph log last cephadm > 2022-09-01T02:46:12.020993+0000 mgr.ceph1.xmbvsb (mgr.254112) 437 : cephadm > [INF] refreshing ceph2 facts > 2022-09-01T02:47:12.016303+0000 mgr.ceph1.xmbvsb (mgr.254112) 469 : cephadm > [INF] refreshing ceph1 facts > 2022-09-01T02:47:12.431002+0000 mgr.ceph1.xmbvsb (mgr.254112) 470 : cephadm > [INF] refreshing ceph2 facts > 2022-09-01T02:48:12.424640+0000 mgr.ceph1.xmbvsb (mgr.254112) 501 : cephadm > [INF] refreshing ceph1 facts > 2022-09-01T02:48:12.839790+0000 mgr.ceph1.xmbvsb (mgr.254112) 502 : cephadm > [INF] refreshing ceph2 facts > 2022-09-01T02:49:12.836875+0000 mgr.ceph1.xmbvsb (mgr.254112) 534 : cephadm > [INF] refreshing ceph1 facts > 2022-09-01T02:49:13.210871+0000 mgr.ceph1.xmbvsb (mgr.254112) 535 : cephadm > [INF] refreshing ceph2 facts > 2022-09-01T02:50:13.207635+0000 mgr.ceph1.xmbvsb (mgr.254112) 566 : cephadm > [INF] refreshing ceph1 facts > 2022-09-01T02:50:13.615722+0000 mgr.ceph1.xmbvsb (mgr.254112) 568 : cephadm > [INF] refreshing ceph2 facts > > > > root@ceph1:~# ceph orch ps > NAME > HOST STATUS REFRESHED AGE VERSION IMAGE NAME > IMAGE ID CONTAINER ID > cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d > ceph1 stopped 3m ago - <unknown> <unknown> > <unknown> <unknown> > cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d > ceph2 stopped 3m ago - <unknown> <unknown> > <unknown> <unknown> > crash.ceph2 > ceph1 running (12d) 3m ago 12d 15.2.17 quay.io/ceph/ceph:v15 > 93146564743f 0a009254afb0 > crash.ceph2 > ceph2 running (12d) 3m ago 12d 15.2.17 quay.io/ceph/ceph:v15 > 93146564743f 0a009254afb0 > mgr.ceph2.hmbdla > ceph1 running (43m) 3m ago 12d 16.2.10 > quay.io/ceph/ceph:v16.2.10 > 0d668911f040 6274723c35f7 > mgr.ceph2.hmbdla > ceph2 running (43m) 3m ago 12d 16.2.10 > quay.io/ceph/ceph:v16.2.10 > 0d668911f040 6274723c35f7 > 
> node-exporter.ceph2  ceph1  running (23m)  3m ago  12d  0.18.1  quay.io/prometheus/node-exporter:v0.18.1  e5a616e4b9cf  7a6217cb1a9e
> node-exporter.ceph2  ceph2  running (23m)  3m ago  12d  0.18.1  quay.io/prometheus/node-exporter:v0.18.1  e5a616e4b9cf  7a6217cb1a9e
> osd.2  ceph1  running (23h)  3m ago  12d  15.2.17  quay.io/ceph/ceph:v15  93146564743f  e286fb1c6302
> osd.2  ceph2  running (23h)  3m ago  12d  15.2.17  quay.io/ceph/ceph:v15  93146564743f  e286fb1c6302
> osd.3  ceph1  running (23h)  3m ago  12d  15.2.17  quay.io/ceph/ceph:v15  93146564743f  d3ae5d9f694f
> osd.3  ceph2  running (23h)  3m ago  12d  15.2.17  quay.io/ceph/ceph:v15  93146564743f  d3ae5d9f694f
> osd.5  ceph1  running (23h)  3m ago  8d  15.2.17  quay.io/ceph/ceph:v15  93146564743f  405068fb474e
> osd.5  ceph2  running (23h)  3m ago  8d  15.2.17  quay.io/ceph/ceph:v15  93146564743f  405068fb474e
>
> What could be wrong here, and how do I debug this issue? cephadm is new to
> me, so I'm not sure where to look for logs.
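For the "where to look for logs" part, here is a rough sketch of what I'd try
next. I haven't reproduced your exact state, so treat this as a starting point
rather than a recipe; the image name and the mgr name are simply the ones shown
in your own output above.

# Check whether the upgrade reports an error or is just sitting idle
ceph orch upgrade status

# Turn up cephadm logging and watch the cluster log while the upgrade runs
ceph config set mgr mgr/cephadm/log_to_cluster_level debug
ceph -W cephadm --watch-debug

# Per the tracker note above: stop the stuck upgrade and restart it with
# --image (a full image reference) instead of --ceph-version
ceph orch upgrade stop
ceph orch upgrade start --image quay.io/ceph/ceph:v16.2.10

# If the orchestrator still looks wedged, failing over to the standby mgr
# sometimes clears it (ceph1.xmbvsb is the active mgr from your "ceph -s")
ceph mgr fail ceph1.xmbvsb

Remember to set mgr/cephadm/log_to_cluster_level back to "info" once you're
done, otherwise the cluster log gets noisy.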