Re: objects degraded higher than 100%

Hi,

There it's 1.2%, not 1200%.

On Wed, Mar 6, 2019 at 4:36 PM Simon Ironside <sironside@xxxxxxxxxxxxx> wrote:
>
> Hi,
>
> I'm still seeing this issue during failure testing of a new Mimic 13.2.4
> cluster. To reproduce:
>
> - Working Mimic 13.2.4 cluster
> - Pull a disk
> - Wait for recovery to complete (i.e. back to HEALTH_OK)
> - Remove the OSD with `ceph osd crush remove`
> - See greater than 100% degraded objects while it recovers as below
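>
> (Concretely, the removal and monitoring commands are roughly the
> following; osd.<id> stands in for whichever OSD backs the pulled disk:)
>
>    ceph osd crush remove osd.<id>   # remove the already-recovered OSD from the CRUSH map
>    ceph -s                          # degraded percentage climbs above 100% while it recovers
>    ceph -w                          # or follow the recovery live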
>
> It doesn't seem to do any harm; once recovery completes the cluster
> returns to HEALTH_OK.
> The only bug I can find on the tracker that seems to cover this
> behaviour is 21803, which is marked as resolved.
>
> Simon
>
>    cluster:
>      id:     MY ID
>      health: HEALTH_WARN
>              709/58572 objects misplaced (1.210%)
>              Degraded data redundancy: 90094/58572 objects degraded (153.818%), 49 pgs degraded, 51 pgs undersized
>
>    services:
>      mon: 3 daemons, quorum san2-mon1,san2-mon2,san2-mon3
>      mgr: san2-mon1(active), standbys: san2-mon2, san2-mon3
>      osd: 52 osds: 52 up, 52 in; 84 remapped pgs
>
>    data:
>      pools:   16 pools, 2016 pgs
>      objects: 19.52 k objects, 72 GiB
>      usage:   7.8 TiB used, 473 TiB / 481 TiB avail
>      pgs:     90094/58572 objects degraded (153.818%)
>               709/58572 objects misplaced (1.210%)
>               1932 active+clean
>               47   active+recovery_wait+undersized+degraded+remapped
>               33   active+remapped+backfill_wait
>               2    active+recovering+undersized+remapped
>               1    active+recovery_wait+undersized+degraded
>               1    active+recovering+undersized+degraded+remapped
>
>    io:
>      client:   24 KiB/s wr, 0 op/s rd, 3 op/s wr
>      recovery: 0 B/s, 126 objects/s
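>
> (For what it's worth, the percentage is just the two raw counters
> divided: 90094 / 58572 ≈ 1.538, i.e. the reported 153.818%. The degraded
> count exceeding the total object count is exactly what shows up here as
> more than 100%.)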
>
>
> On 13/10/2017 18:53, David Zafman wrote:
> >
> > I improved the code that computes degraded objects during
> > backfill/recovery. During my testing it didn't result in a percentage
> > above 100%. I'll have to look at the code and verify that some
> > subsequent changes didn't break things.
> >
> > David
> >
> >
> > On 10/13/17 9:55 AM, Florian Haas wrote:
> >>>>> Okay, in that case I've no idea. What was the timeline for the
> >>>>> recovery versus the rados bench and cleanup versus the degraded
> >>>>> object counts, then?
> >>>> 1. Jewel deployment with filestore.
> >>>> 2. Upgrade to Luminous (including mgr deployment and "ceph osd
> >>>> require-osd-release luminous"), still on filestore.
> >>>> 3. rados bench with subsequent cleanup.
> >>>> 4. All OSDs up, all PGs active+clean.
> >>>> 5. Stop one OSD. Remove from CRUSH, auth list, OSD map.
> >>>> 6. Reinitialize OSD with bluestore.
> >>>> 7. Start OSD, commencing backfill.
> >>>> 8. Degraded objects above 100%.
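> >>>>
> >>>> (In command terms, steps 5-7 were roughly the following sketch; the
> >>>> OSD id and the device path are placeholders:)
> >>>>
> >>>>    systemctl stop ceph-osd@<id>
> >>>>    ceph osd crush remove osd.<id>
> >>>>    ceph auth del osd.<id>
> >>>>    ceph osd rm <id>
> >>>>    ceph-volume lvm create --bluestore --data /dev/<device>   # re-create and activate as bluestore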
> >>>>
> >>>> Please let me know if that information is useful. Thank you!
> >>>
> >>> Hmm, that does leave me a little perplexed.
> >> Yeah exactly, me too. :)
> >>
> >>> David, do we maybe do something with degraded counts based on the
> >>> number of objects identified in pg logs? Or some other heuristic for
> >>> the number of objects that might be stale? That's the only way I can
> >>> think of to get these weird returning sets.
> >> One thing that just crossed my mind: would it make a difference
> >> whether or not the OSD is marked out in the time window between it
> >> going down and being deleted from the crushmap/osdmap? I think it
> >> shouldn't (whether marked out or simply non-existent, it's not
> >> eligible for holding any data either way), but I'm not really sure
> >> about the mechanics of the internals here.
> >>
> >> Cheers,
> >> Florian
> >
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


