Hi Max,

I encountered the same error with my 3-node cluster a few days ago. When I added a fourth node to the cluster, the PGs came back to a healthy state. It seems to be a corner case of the CRUSH algorithm which hits only in small clusters (a rough sketch of one possible workaround is appended after the quoted message below).

Quoting another ceph user: "yes the pg should get remapped, but that is not always the case. For discussion on this, check out the tracker below. Your particular circumstances may be a little different, but the idea is the same. http://tracker.ceph.com/issues/3806"

Thanks
Gaurav

On Wed, May 11, 2016 at 5:41 PM, Max Vernimmen <m.vernimmen@xxxxxxxxxxxxxxx> wrote:
> Hi,
>
> I'm looking for some help in figuring out why there are 2 pgs in our
> cluster in 'active+undersized+degraded' state. They don't seem to get
> assigned a 3rd osd to place data on. I'm not sure why; everything looks 'ok'
> to me. Our ceph cluster consists of 3 nodes and has been upgraded from
> firefly to hammer to infernalis during its lifetime. Everything was fine
> until a few days ago when I created a new pool with
>
> # ceph osd pool create poolname 2048 replicated
>
> and proceeded to create two rbd images:
>
> # rbd create --size 2T poolname/image1
> # rbd create --size 1600G poolname/image2
>
> Since that moment ceph health shows a warning:
>
>     cluster 6318a6a2-808b-45a1-9c89-31575c58de49
>      health HEALTH_WARN
>             2 pgs degraded
>             2 pgs stuck degraded
>             2 pgs stuck unclean
>             2 pgs stuck undersized
>             2 pgs undersized
>             recovery 389/9308133 objects degraded (0.004%)
>      monmap e7: 4 mons at {md002=172.19.20.2:6789/0,md005=172.19.20.5:6789/0,md008=172.19.20.8:6789/0,md010=172.19.20.10:6789/0}
>             election epoch 18774, quorum 0,1,2,3 md002,md005,md008,md010
>      osdmap e105161: 30 osds: 30 up, 30 in
>       pgmap v12776313: 2880 pgs, 5 pools, 12089 GB data, 3029 kobjects
>             36771 GB used, 24661 GB / 61433 GB avail
>             389/9308133 objects degraded (0.004%)
>                 2878 active+clean
>                    2 active+undersized+degraded
>   client io 1883 kB/s rd, 15070 B/s wr, 1 op/s
>
> There is no recovery going on.
>
> We are on version 9.2.1 on CentOS 7 with kernel 4.4.9, except for the
> monitoring node which is still on 4.4.0.
>
> I've used crushtool to check whether the mapping should be ok, and it seems
> to be fine (but I think this assumes all nodes in a cluster to be exactly
> the same, which they are not in our situation).
>
> There are no errors in the ceph logs (zgrep -i err *gz in /var/log/ceph).
>
> Pg-num and pgp-num are both set to 2048 for this pool.
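
For reference, the crushtool check mentioned above can be run along the lines below. This is only a sketch: <ruleset-id> is a placeholder for whatever "ceph osd pool get poolname crush_ruleset" reports, and the x values crushtool tests are synthetic inputs rather than the pool's real pg ids, so it approximates the mapping rather than literally replaying pg 24.17 and 24.54a.

# ceph osd pool get poolname crush_ruleset
# ceph osd getcrushmap -o crushmap.bin
# crushtool -i crushmap.bin --test --rule <ruleset-id> --num-rep 3 --min-x 0 --max-x 2047 --show-bad-mappings

--show-bad-mappings prints only the inputs for which the rule returned fewer than 3 OSDs, so empty output means crushtool considers the map able to place 3 copies, which matches what is reported above.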
>
> The details:
>
> HEALTH_WARN 2 pgs degraded; 2 pgs stuck degraded; 2 pgs stuck unclean;
> 2 pgs stuck undersized; 2 pgs undersized; recovery 389/9308133 objects
> degraded (0.004%)
> pg 24.17 is stuck unclean since forever, current state
> active+undersized+degraded, last acting [23,1]
> pg 24.54a is stuck unclean since forever, current state
> active+undersized+degraded, last acting [8,19]
> pg 24.17 is stuck undersized for 9653.439112, current state
> active+undersized+degraded, last acting [23,1]
> pg 24.54a is stuck undersized for 9659.961863, current state
> active+undersized+degraded, last acting [8,19]
> pg 24.17 is stuck degraded for 9653.439186, current state
> active+undersized+degraded, last acting [23,1]
> pg 24.54a is stuck degraded for 9659.961940, current state
> active+undersized+degraded, last acting [8,19]
> pg 24.54a is active+undersized+degraded, acting [8,19]
> pg 24.17 is active+undersized+degraded, acting [23,1]
> recovery 389/9308133 objects degraded (0.004%)
>
> # ceph pg dump_stuck degraded
> ok
> pg_stat  state                       up      up_primary  acting  acting_primary
> 24.17    active+undersized+degraded  [23,1]  23          [23,1]  23
> 24.54a   active+undersized+degraded  [8,19]  8           [8,19]  8
>
> # ceph pg map 24.17
> osdmap e105161 pg 24.17 (24.17) -> up [23,1] acting [23,1]
> # ceph pg map 24.54a
> osdmap e105161 pg 24.54a (24.54a) -> up [8,19] acting [8,19]
>
> The osd tree and crushmap can be found here: http://pastebin.com/i4BQq5Mi
>
> I'm hoping for some insight into why this is happening. I couldn't find
> much out there on the net about undersized pg states, other than people
> trying to get a replication of 3 with fewer than 3 OSDs, or fewer than 3
> hosts when a host level has been specified in the crushmap hierarchy, but
> that doesn't apply here.
>
> Best regards,
>
> Max
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

--
Gaurav Bafna
9540631400

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
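
If adding a fourth node is not an option, one workaround that often comes up for this small-cluster corner case is giving CRUSH more attempts to find a third distinct host before it gives up. A rough, untested sketch, assuming the crush map was already dumped to crushmap.bin as above and <ruleset-id> is again the pool's ruleset; note that injecting a modified map can trigger data movement, so it is worth checking the result with crushtool first:

# ceph osd crush show-tunables
# crushtool -d crushmap.bin -o crushmap.txt
  (edit crushmap.txt and raise, or add, the "tunable choose_total_tries" line, e.g. from 50 to 100)
# crushtool -c crushmap.txt -o crushmap.new
# crushtool -i crushmap.new --test --rule <ruleset-id> --num-rep 3 --min-x 0 --max-x 2047 --show-bad-mappings
# ceph osd setcrushmap -i crushmap.new

Whether this helps here is not certain, since the crushtool run mentioned above already reported the mapping as fine; it is offered only as the usual knob to look at when CRUSH cannot find enough distinct hosts in a small cluster.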