Hi,

I have a small cluster in the lab with only two nodes; it has a single monitor and two OSD nodes. I am running an upgrade, but it somehow got stuck after upgrading the mgr:

ceph orch upgrade start --ceph-version 16.2.10

root@ceph1:~# ceph -s
  cluster:
    id:     f270ad9e-1f6f-11ed-b6f8-a539d87379ea
    health: HEALTH_WARN
            5 stray daemon(s) not managed by cephadm

  services:
    mon: 1 daemons, quorum ceph1 (age 22m)
    mgr: ceph1.xmbvsb(active, since 21m), standbys: ceph2.hmbdla
    osd: 6 osds: 6 up (since 23h), 6 in (since 8d)

  data:
    pools:   6 pools, 161 pgs
    objects: 20.53k objects, 85 GiB
    usage:   173 GiB used, 826 GiB / 1000 GiB avail
    pgs:     161 active+clean

  io:
    client:   0 B/s rd, 2.7 KiB/s wr, 0 op/s rd, 0 op/s wr

  progress:
    Upgrade to quay.io/ceph/ceph:v16.2.10 (0s)
      [............................]

root@ceph1:~# ceph health detail
HEALTH_WARN 5 stray daemon(s) not managed by cephadm
[WRN] CEPHADM_STRAY_DAEMON: 5 stray daemon(s) not managed by cephadm
    stray daemon mgr.ceph1.xmbvsb on host ceph1 not managed by cephadm
    stray daemon mon.ceph1 on host ceph1 not managed by cephadm
    stray daemon osd.0 on host ceph1 not managed by cephadm
    stray daemon osd.1 on host ceph1 not managed by cephadm
    stray daemon osd.4 on host ceph1 not managed by cephadm

root@ceph1:~# ceph log last cephadm
2022-09-01T02:46:12.020993+0000 mgr.ceph1.xmbvsb (mgr.254112) 437 : cephadm [INF] refreshing ceph2 facts
2022-09-01T02:47:12.016303+0000 mgr.ceph1.xmbvsb (mgr.254112) 469 : cephadm [INF] refreshing ceph1 facts
2022-09-01T02:47:12.431002+0000 mgr.ceph1.xmbvsb (mgr.254112) 470 : cephadm [INF] refreshing ceph2 facts
2022-09-01T02:48:12.424640+0000 mgr.ceph1.xmbvsb (mgr.254112) 501 : cephadm [INF] refreshing ceph1 facts
2022-09-01T02:48:12.839790+0000 mgr.ceph1.xmbvsb (mgr.254112) 502 : cephadm [INF] refreshing ceph2 facts
2022-09-01T02:49:12.836875+0000 mgr.ceph1.xmbvsb (mgr.254112) 534 : cephadm [INF] refreshing ceph1 facts
2022-09-01T02:49:13.210871+0000 mgr.ceph1.xmbvsb (mgr.254112) 535 : cephadm [INF] refreshing ceph2 facts
2022-09-01T02:50:13.207635+0000 mgr.ceph1.xmbvsb (mgr.254112) 566 : cephadm [INF] refreshing ceph1 facts
2022-09-01T02:50:13.615722+0000 mgr.ceph1.xmbvsb (mgr.254112) 568 : cephadm [INF] refreshing ceph2 facts

root@ceph1:~# ceph orch ps
NAME                                                                      HOST   STATUS         REFRESHED  AGE  VERSION    IMAGE NAME                                IMAGE ID      CONTAINER ID
cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d  ceph1  stopped        3m ago     -    <unknown>  <unknown>                                 <unknown>     <unknown>
cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d  ceph2  stopped        3m ago     -    <unknown>  <unknown>                                 <unknown>     <unknown>
crash.ceph2                                                               ceph1  running (12d)  3m ago     12d  15.2.17    quay.io/ceph/ceph:v15                     93146564743f  0a009254afb0
crash.ceph2                                                               ceph2  running (12d)  3m ago     12d  15.2.17    quay.io/ceph/ceph:v15                     93146564743f  0a009254afb0
mgr.ceph2.hmbdla                                                          ceph1  running (43m)  3m ago     12d  16.2.10    quay.io/ceph/ceph:v16.2.10                0d668911f040  6274723c35f7
mgr.ceph2.hmbdla                                                          ceph2  running (43m)  3m ago     12d  16.2.10    quay.io/ceph/ceph:v16.2.10                0d668911f040  6274723c35f7
node-exporter.ceph2                                                       ceph1  running (23m)  3m ago     12d  0.18.1     quay.io/prometheus/node-exporter:v0.18.1  e5a616e4b9cf  7a6217cb1a9e
node-exporter.ceph2                                                       ceph2  running (23m)  3m ago     12d  0.18.1     quay.io/prometheus/node-exporter:v0.18.1  e5a616e4b9cf  7a6217cb1a9e
osd.2                                                                     ceph1  running (23h)  3m ago     12d  15.2.17    quay.io/ceph/ceph:v15                     93146564743f  e286fb1c6302
osd.2                                                                     ceph2  running (23h)  3m ago     12d  15.2.17    quay.io/ceph/ceph:v15                     93146564743f  e286fb1c6302
osd.3                                                                     ceph1  running (23h)  3m ago     12d  15.2.17    quay.io/ceph/ceph:v15                     93146564743f  d3ae5d9f694f
osd.3                                                                     ceph2  running (23h)  3m ago     12d  15.2.17    quay.io/ceph/ceph:v15                     93146564743f  d3ae5d9f694f
osd.5                                                                     ceph1  running (23h)  3m ago     8d   15.2.17    quay.io/ceph/ceph:v15                     93146564743f  405068fb474e
osd.5                                                                     ceph2  running (23h)  3m ago     8d   15.2.17    quay.io/ceph/ceph:v15                     93146564743f  405068fb474e
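As far as I can tell from the cephadm docs, the orchestrator can also report on the upgrade itself, so this is roughly what I was planning to run next to see whether it is making any progress at all (just a sketch of the documented commands, not output from this cluster):

# show what the orchestrator thinks the upgrade is doing
ceph orch upgrade status

# force a refresh of the inventory behind "ceph orch ps"
ceph orch ps --refresh

# the upgrade can also be paused, resumed or stopped if needed
ceph orch upgrade pause
ceph orch upgrade resume
ceph orch upgrade stop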
What could be wrong here, and how can I debug this? cephadm is new to me, so I am not sure where to look for logs.
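In case it is useful context: based on the cephadm troubleshooting docs, these seem to be the main places to look (a sketch only; the fsid and daemon name are the ones from this cluster and are probably the parts that need adjusting):

# cluster-wide cephadm log channel (run from a node with an admin keyring)
ceph log last cephadm
ceph -W cephadm                      # follow the cephadm channel live

# raise cephadm's log level in the mgr, then watch the debug stream
ceph config set mgr mgr/cephadm/log_to_cluster_level debug
ceph -W cephadm --watch-debug

# on each host: the log written by the cephadm binary itself
less /var/log/ceph/cephadm.log

# per-daemon logs live in journald; via systemd or the cephadm wrapper
journalctl -u ceph-f270ad9e-1f6f-11ed-b6f8-a539d87379ea@mgr.ceph1.xmbvsb.service
cephadm logs --name mgr.ceph1.xmbvsb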