Re: Question regarding degraded PGs

On Thu, Aug 27, 2015 at 2:54 AM, Goncalo Borges
<goncalo@xxxxxxxxxxxxxxxxxxx> wrote:
> Hey guys...
>
> 1./ I have a simple question regarding the appearance of degraded PGs.
> First, for reference:
>
> a. I am working with 0.94.2
>
> b. I have 32 OSDs distributed across 4 servers, meaning that I have 8 OSDs per
> server.
>
> c. Our cluster is set with 'osd pool default size = 3' and 'osd pool default
> min size = 2'
>
>
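(For reference, those two values are the ceph.conf defaults, typically set
under [global]:

    [global]
    osd pool default size = 3
    osd pool default min size = 2

An existing pool's replica count can also be changed at runtime with
"ceph osd pool set <pool> size <n>".)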
> 2./ I am testing the cluster in several disaster recovery scenarios, and
> I've deliberately powered down a storage server, with its 8 OSDs. At this
> point, everything went fine: during the night, the cluster performed all the
> recovery I/O, and in the morning, I got a 'HEALTH_OK' cluster running on
> only 3 servers and 24 OSDs.
>
> 3./ I've now powered up the missing server and, as expected, the cluster
> enters 'HEALTH_WARN' and adjusts itself to the presence of one more
> server and 8 more populated OSDs.
>
> 4./ However, what I do not understand is why, during this last process, some
> PGs are reported as degraded. See the 'ceph -s' output below. As far
> as I understand, a degraded PG means that ceph has not yet replicated some
> objects in the placement group the correct number of times. That should not
> be the case here because, if we started from a 'HEALTH_OK' situation, it
> means all PGs were coherent. What happens under the covers when this
> new server (and its 8 populated OSDs) rejoins the cluster that triggers the
> existence of degraded PGs?

Hmm. I too would expect those PGs to be reporting as "remapped" rather
than "degraded". And indeed they are all remapped in addition to being
degraded. Can you get the pg query for one of these degraded PGs and
post it to the list? Sam, do you expect this behavior?
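Something along these lines should capture it (the PG ID below is only a
placeholder; pick one that "ceph health detail" lists as degraded):

# ceph health detail | grep degraded
# ceph pg 2.1a query > pg-2.1a-query.txt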
-Greg

>
> # ceph -s
>     cluster eea8578f-b3ac-4dfb-a0c5-da40509f5cdc
>      health HEALTH_WARN
>             115 pgs backfill
>             121 pgs backfilling
>             513 pgs degraded
>             31 pgs recovering
>             309 pgs recovery_wait
>             513 pgs stuck degraded
>             576 pgs stuck unclean
>             recovery 198838/8567132 objects degraded (2.321%)
>             recovery 3267325/8567132 objects misplaced (38.138%)
>      monmap e1: 3 mons at
> {mon1=X.X.X.X:6789/0,mon2=X.X.X.X.34:6789/0,mon3=X.X.X.X:6789/0}
>             election epoch 24, quorum 0,1,2 mon1,mon3,mon2
>      mdsmap e162: 1/1/1 up {0=rccephmds=up:active}, 1 up:standby-replay
>      osdmap e4764: 32 osds: 32 up, 32 in; 555 remapped pgs
>       pgmap v1159567: 2176 pgs, 2 pools, 6515 GB data, 2240 kobjects
>             22819 GB used, 66232 GB / 89051 GB avail
>             198838/8567132 objects degraded (2.321%)
>             3267325/8567132 objects misplaced (38.138%)
>                 1600 active+clean
>                  292 active+recovery_wait+degraded+remapped
>                  113 active+degraded+remapped+backfilling
>                   60 active+degraded+remapped+wait_backfill
>                   55 active+remapped+wait_backfill
>                   27 active+recovering+degraded+remapped
>                   17 active+recovery_wait+degraded
>                    8 active+remapped+backfilling
>                    4 active+recovering+degraded
> recovery io 521 MB/s, 170 objects/s
>
> Cheers
> Goncalo
>
> --
> Goncalo Borges
> Research Computing
> ARC Centre of Excellence for Particle Physics at the Terascale
> School of Physics A28 | University of Sydney, NSW  2006
> T: +61 2 93511937
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


