ceph progress bar stuck and 3rd manager not deploying

Hello

I have a test Ceph 16.2.5 (Pacific) cluster, deployed with cephadm on 7 bare-metal nodes running Ubuntu 20.04 LTS. I just upgraded the kernel on each node and performed a rolling reboot. Since then, the progress section of the ceph -s output has been stuck, and the manager service is deployed on only two nodes instead of three. Here is the ceph -s output:

  cluster:
    id:     fb48d256-f43d-11eb-9f74-7fd39d4b232a
    health: HEALTH_WARN
            OSD count 1 < osd_pool_default_size 3

  services:
    mon: 2 daemons, quorum ceph1a,ceph1c (age 25m)
    mgr: ceph1a.guidwn(active, since 25m), standbys: ceph1c.bttxuu
    osd: 1 osds: 1 up (since 30m), 1 in (since 3w)

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   5.3 MiB used, 7.0 TiB / 7.0 TiB avail
    pgs:

  progress:
    Updating crash deployment (-1 -> 6) (0s)
      [............................]

Please ignore the HEALTH_WARN about the OSD count; I have not finished deploying all 3 OSDs yet. As you can see, the progress bar is stuck and I have only 2 managers. The third manager does not seem to start, as can be seen here:

$ ceph orch ps|grep stopped
mon.ceph1b            ceph1b               stopped           4m ago   4w        -    2048M  <unknown>  <unknown>     <unknown>
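
The grep above only catches daemons whose status is literally "stopped". As a rough sketch, the same check can be widened to list every daemon that is not reported as running. The function name non_running_daemons is made up for illustration, and the filter assumes that healthy daemons show the word "running" as a separate column in the ceph orch ps output, as they do on this cluster:

```shell
# Hypothetical helper: reads `ceph orch ps` text on stdin and prints
# every daemon row that does not contain a "running" column.
non_running_daemons() {
    awk '
        $1 == "NAME" { next }          # skip the header row
        NF == 0      { next }          # skip blank lines
        {
            ok = 0
            for (i = 1; i <= NF; i++)  # look for a field that is exactly "running"
                if ($i == "running") ok = 1
            if (!ok) print             # print the original row unchanged
        }
    '
}
```

It would be used as: ceph orch ps | non_running_daemons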

It looks like the orchestrator is stuck and does not continue its job. Any idea how I can get it unstuck?

Best regards,
Mabi

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


