Hi guys,

I am in the middle of removing an OSD from my rook-ceph cluster. I ran 'ceph osd out osd.7' and the rebalancing process started. Now it has stalled with one PG stuck in "active+undersized+degraded". I have done this before and it worked fine.

# ceph health detail
HEALTH_WARN Degraded data redundancy: 15/94659 objects degraded (0.016%), 1 pg degraded, 1 pg undersized
[WRN] PG_DEGRADED: Degraded data redundancy: 15/94659 objects degraded (0.016%), 1 pg degraded, 1 pg undersized
    pg 3.1f is stuck undersized for 2h, current state active+undersized+degraded, last acting [0,2]

# ceph pg dump_stuck
PG_STAT  STATE                       UP     UP_PRIMARY  ACTING  ACTING_PRIMARY
3.1f     active+undersized+degraded  [0,2]  0           [0,2]   0

I have lots of OSDs on different nodes:

# ceph osd tree
ID   CLASS  WEIGHT    TYPE NAME                            STATUS  REWEIGHT  PRI-AFF
 -1         13.77573  root default
 -5         13.77573      region FSN1
-22          0.73419          zone FSN1-DC13
-21                0              host node5-redacted-com
-27          0.73419              host node7-redacted-com
  1    ssd   0.36710                  osd.1                    up   1.00000  1.00000
  5    ssd   0.36710                  osd.5                    up   1.00000  1.00000
-10          6.20297          zone FSN1-DC14
 -9          6.20297              host node3-redacted-com
  2    ssd   3.10149                  osd.2                    up   1.00000  1.00000
  4    ssd   3.10149                  osd.4                    up   1.00000  1.00000
-18          3.19919          zone FSN1-DC15
-17          3.19919              host node4-redacted-com
  7    ssd   3.19919                  osd.7                  down         0  1.00000
 -4          2.90518          zone FSN1-DC16
 -3          2.90518              host node1-redacted-com
  0    ssd   1.45259                  osd.0                    up   1.00000  1.00000
  3    ssd   1.45259                  osd.3                    up   1.00000  1.00000
-14          0.73419          zone FSN1-DC18
-13                0              host node2-redacted-com
-25          0.73419              host node6-redacted-com
 10    ssd   0.36710                  osd.10                   up   1.00000  1.00000
 11    ssd   0.36710                  osd.11                   up   1.00000  1.00000

Any ideas on how to fix this?

Thanks
David
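P.S. For completeness, this is the rough checklist of diagnostics I was planning to run next (3.1f is the stuck PG from the output above; I haven't gone through all of them yet), in case the output from any of these would help:

# ceph osd pool ls detail
    (size/min_size of each pool and which crush_rule it uses)
# ceph osd crush rule dump
    (the CRUSH rules, including the failure-domain step for the pool's rule)
# ceph pg 3.1f query
    (detailed peering/recovery state for the stuck PG)
# ceph osd df tree
    (per-zone and per-OSD weights and utilisation)

Happy to post any of that output if it helps.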