Question regarding degraded PGs

Hey guys...

1./ I have a simple question regarding the appearance of degraded PGs. First, for reference:
a. I am working with Ceph 0.94.2.

b. I have 32 OSDs distributed across 4 servers, i.e. 8 OSDs per server.

c. Our cluster is set with 'osd pool default size = 3' and 'osd pool default min size = 2'

2./ I am testing the cluster in several disaster recovery scenarios, and I deliberately powered down one storage server with its 8 OSDs. At this point, everything went fine: during the night, the cluster performed all the recovery I/O, and in the morning I had a 'HEALTH_OK' cluster running on only 3 servers and 24 OSDs.

3./ I have now powered the missing server back up and, as expected, the cluster enters 'HEALTH_WARN' and adjusts itself to the presence of one more server and its 8 populated OSDs.

4./ However, what I do not understand is why, during this last step, some PGs are reported as degraded. See the 'ceph -s' output below. As far as I understand, a degraded PG means that Ceph has not yet replicated some objects in the placement group the correct number of times. That should not be the case here: since we started from a 'HEALTH_OK' situation, all PGs were fully replicated. What happens under the covers when this server (and its 8 populated OSDs) rejoins the cluster that triggers the appearance of degraded PGs?

# ceph -s
    cluster eea8578f-b3ac-4dfb-a0c5-da40509f5cdc
     health HEALTH_WARN
            115 pgs backfill
            121 pgs backfilling
            513 pgs degraded
            31 pgs recovering
            309 pgs recovery_wait
            513 pgs stuck degraded
            576 pgs stuck unclean
            recovery 198838/8567132 objects degraded (2.321%)
            recovery 3267325/8567132 objects misplaced (38.138%)
     monmap e1: 3 mons at {mon1=X.X.X.X:6789/0,mon2=X.X.X.X:6789/0,mon3=X.X.X.X:6789/0}
            election epoch 24, quorum 0,1,2 mon1,mon3,mon2
     mdsmap e162: 1/1/1 up {0=rccephmds=up:active}, 1 up:standby-replay
     osdmap e4764: 32 osds: 32 up, 32 in; 555 remapped pgs
      pgmap v1159567: 2176 pgs, 2 pools, 6515 GB data, 2240 kobjects
            22819 GB used, 66232 GB / 89051 GB avail
            198838/8567132 objects degraded (2.321%)
            3267325/8567132 objects misplaced (38.138%)
                1600 active+clean
                 292 active+recovery_wait+degraded+remapped
                 113 active+degraded+remapped+backfilling
                  60 active+degraded+remapped+wait_backfill
                  55 active+remapped+wait_backfill
                  27 active+recovering+degraded+remapped
                  17 active+recovery_wait+degraded
                   8 active+remapped+backfilling
                   4 active+recovering+degraded
recovery io 521 MB/s, 170 objects/s
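As a side note on reading that output: the degraded and misplaced percentages are just the object counts divided by the total shown on the same line. A quick Python sketch (using the numbers copied from the pgmap above; the interpretation of the denominator as the total count of object copies is my assumption) reproduces them:

```python
# Numbers copied from the 'ceph -s' pgmap output above.
degraded_objects = 198838
misplaced_objects = 3267325
total_copies = 8567132  # assumed: total object copies reported by ceph

# Percentages as ceph -s prints them (three decimal places).
degraded_pct = 100 * degraded_objects / total_copies
misplaced_pct = 100 * misplaced_objects / total_copies

print(f"degraded:  {degraded_pct:.3f}%")   # matches the reported 2.321%
print(f"misplaced: {misplaced_pct:.3f}%")  # matches the reported 38.138%
```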

Cheers
Goncalo

-- 
Goncalo Borges
Research Computing
ARC Centre of Excellence for Particle Physics at the Terascale
School of Physics A28 | University of Sydney, NSW  2006
T: +61 2 93511937
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com