Re: why are there "degraded" PGs when adding OSDs?

Hi Sam,
I think I may have found the problem:  I noticed that the new host's bucket was
created with straw2 instead of straw.  Would this account for 50% of PGs being degraded?

(I'm removing the OSDs on that host and will recreate them with 'firefly' tunables.)
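
In case it helps anyone following along, something like this should show which
algorithm each CRUSH bucket is using, and apply the tunables profile switch I
mentioned (untested as typed; the file paths are just examples):

    # show each bucket's name and algorithm straight from the cluster
    ceph osd crush dump | grep -E '"name"|"alg"'

    # or decompile the binary CRUSH map for a readable view
    ceph osd getcrushmap -o /tmp/crushmap
    crushtool -d /tmp/crushmap -o /tmp/crushmap.txt
    grep -E '^(host|root) |alg ' /tmp/crushmap.txt

    # switch the cluster to the firefly tunables profile
    ceph osd crush tunables firefly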

Thanks!
Chad.

On Monday, July 27, 2015 15:09:21 Chad William Seys wrote:
> Hi Sam,
> 	I'll need help getting the osdmap and pg dump prior to addition.
> 	I can remove the OSDs and add again if the osdmap (etc.) is not logged
> somewhere.
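> 
> (Presumably older epochs can still be fetched straight from the monitors with
> something like 'ceph osd getmap <epoch> -o /tmp/osdmap.<epoch>', as long as
> they haven't been trimmed yet; <epoch> is a placeholder.)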
> 
> Chad.
> 
> > Hmm, that's odd.  Can you attach the osdmap and ceph pg dump prior to the
> > addition (with all pgs active+clean), then the osdmap and ceph pg dump
> > afterwards? -Sam
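> > 
> > Something along these lines should capture both; the filenames are just
> > examples:
> > 
> >     ceph osd getmap -o /tmp/osdmap.before
> >     ceph pg dump > /tmp/pgdump.before.txt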
> > 
> > ----- Original Message -----
> > From: "Chad William Seys" <cwseys@xxxxxxxxxxxxxxxx>
> > To: "Samuel Just" <sjust@xxxxxxxxxx>, "ceph-users" <ceph-users@xxxxxxxx>
> > Sent: Monday, July 27, 2015 12:57:23 PM
> > Subject: Re:  why are there "degraded" PGs when adding OSDs?
> > 
> > Hi Sam,
> > 
> > > The pg might also be degraded right after a map change which changes the
> > > up/acting sets, since the few objects updated right before the map change
> > > might be new on some replicas and old on the other replicas.  While in that
> > > state, those specific objects are degraded, and the pg would report degraded
> > > until they are recovered (which would happen asap, prior to backfilling the
> > > new replica). -Sam
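> > > 
> > > To see which PGs are carrying those degraded objects, something like the
> > > following should do it (<pgid> is a placeholder):
> > > 
> > >     ceph health detail | grep degraded    # lists the degraded PGs
> > >     ceph pg <pgid> query                  # per-PG detail, incl. recovery state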
> > 
> > That sounds like only a few PGs should be degraded.  I instead have about
> > 45% (and higher earlier).
> > 
> > # ceph -s
> >     cluster 7797e50e-f4b3-42f6-8454-2e2b19fa41d6
> >      health HEALTH_WARN
> >             2081 pgs backfill
> >             6745 pgs degraded
> >             17 pgs recovering
> >             6728 pgs recovery_wait
> >             6745 pgs stuck degraded
> >             8826 pgs stuck unclean
> >             recovery 2530124/5557452 objects degraded (45.527%)
> >             recovery 33594/5557452 objects misplaced (0.604%)
> >      monmap e5: 3 mons at {mon01=128.104.164.197:6789/0,mon02=128.104.164.198:6789/0,mon03=10.128.198.51:6789/0}
> >             election epoch 16458, quorum 0,1,2 mon03,mon01,mon02
> >      mdsmap e3032: 1/1/1 up {0=mds01.hep.wisc.edu=up:active}
> >      osdmap e149761: 27 osds: 27 up, 27 in; 2083 remapped pgs
> >       pgmap v13464928: 18432 pgs, 9 pools, 5401 GB data, 1364 kobjects
> >             11122 GB used, 11786 GB / 22908 GB avail
> >             2530124/5557452 objects degraded (45.527%)
> >             33594/5557452 objects misplaced (0.604%)
> >                 9606 active+clean
> >                 6726 active+recovery_wait+degraded
> >                 2081 active+remapped+wait_backfill
> >                   17 active+recovering+degraded
> >                    2 active+recovery_wait+degraded+remapped
> > recovery io 24861 kB/s, 6 objects/s
> > 
> > Chad.
> > 
> > > ----- Original Message -----
> > > From: "Chad William Seys" <cwseys@xxxxxxxxxxxxxxxx>
> > > To: "ceph-users" <ceph-users@xxxxxxxx>
> > > Sent: Monday, July 27, 2015 12:27:26 PM
> > > Subject:  why are there "degraded" PGs when adding OSDs?
> > > 
> > > Hi All,
> > > 
> > > I recently added some OSDs to the Ceph cluster (0.94.2). I noticed that
> > > 'ceph -s' reported both misplaced AND degraded PGs.
> > > 
> > > Why should any PGs become degraded?  It seems as though Ceph should only be
> > > reporting misplaced PGs.
> > > 
> > > From the Giant release notes:
> > > Degraded vs misplaced: the Ceph health reports from ‘ceph -s’ and related
> > > commands now make a distinction between data that is degraded (there are
> > > fewer than the desired number of copies) and data that is misplaced (stored
> > > in the wrong location in the cluster). The distinction is important because
> > > the latter does not compromise data safety.
> > > 
> > > Does Ceph delete some replicas of the PGs (leading to degradation) before
> > > re-replicating on the new OSD?
> > > 
> > > This does not seem to be the safest algorithm.
> > > _______________________________________________
> > > ceph-users mailing list
> > > ceph-users@xxxxxxxxxxxxxx
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



