Hi all,

I have a 4-node Ceph cluster. After I shut down the cluster and tried to start it again, it failed to come back: ceph orch commands (such as "ceph orch status") hang. How should I recover from this problem?

root@ceph-manager:/# ceph orch status
==> hung
^CInterrupted

root@ceph-manager:/# ceph status
  cluster:
    id:     4588ed80-352b-11ee-9eae-157ca4325420
    health: HEALTH_ERR
            2 failed cephadm daemon(s)
            1 filesystem is degraded
            1 filesystem is offline
            pauserd,pausewr,nodown,noout,nobackfill,norebalance,norecover flag(s) set
            10 slow ops, oldest one blocked for 3736 sec, mon.ceph-osd0 has slow ops

  services:
    mon: 4 daemons, quorum ceph-manager,ceph-osd0,ceph-osd1,ceph-osd2 (age 64m)
    mgr: ceph-manager.kurjlh(active, since 64m), standbys: ceph-osd0.jodevs
    mds: 0/1 daemons up (1 failed), 2 standby
    osd: 3 osds: 3 up (since 64m), 3 in (since 2w)
         flags pauserd,pausewr,nodown,noout,nobackfill,norebalance,norecover

  data:
    volumes: 0/1 healthy, 1 failed
    pools:   11 pools, 243 pgs
    objects: 3.01k objects, 9.4 GiB
    usage:   28 GiB used, 2.8 TiB / 2.8 TiB avail
    pgs:     243 active+clean

root@ceph-manager:/# ceph health detail
HEALTH_ERR 2 failed cephadm daemon(s); 1 filesystem is degraded; 1 filesystem is offline; pauserd,pausewr,nodown,noout,nobackfill,norebalance,norecover flag(s) set; 10 slow ops, oldest one blocked for 3741 sec, mon.ceph-osd0 has slow ops
[WRN] CEPHADM_FAILED_DAEMON: 2 failed cephadm daemon(s)
    daemon rgw.sno_rgw.ceph-manager.umzmku on ceph-manager is in error state
    daemon rgw.sno_rgw.ceph-osd2.vfpmbs on ceph-osd2 is in error state
[WRN] FS_DEGRADED: 1 filesystem is degraded
    fs sno_cephfs is degraded
[ERR] MDS_ALL_DOWN: 1 filesystem is offline
    fs sno_cephfs is offline because no MDS is active for it.
[WRN] OSDMAP_FLAGS: pauserd,pausewr,nodown,noout,nobackfill,norebalance,norecover flag(s) set
[WRN] SLOW_OPS: 10 slow ops, oldest one blocked for 3741 sec, mon.ceph-osd0 has slow ops
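
My guess (not run yet) is that the osdmap flags listed in the health output are left over from the shutdown and need to be cleared before I/O and recovery can resume, roughly like this:

    ceph osd unset noout
    ceph osd unset nodown
    ceph osd unset nobackfill
    ceph osd unset norebalance
    ceph osd unset norecover
    ceph osd unset pause    # should clear both pauserd and pausewr

Is clearing the flags the right first step, or is there more to it given that the orch commands hang and the MDS for sno_cephfs is offline?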