On Thu, Oct 12, 2017 at 10:52 AM Florian Haas <florian@xxxxxxxxxxx> wrote:
On Thu, Oct 12, 2017 at 7:22 PM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>
>
> On Thu, Oct 12, 2017 at 3:50 AM Florian Haas <florian@xxxxxxxxxxx> wrote:
>>
>> On Mon, Sep 11, 2017 at 8:13 PM, Andreas Herrmann <andreas@xxxxxxxx>
>> wrote:
>> > Hi,
>> >
>> > how could this happen:
>> >
>> > pgs: 197528/1524 objects degraded (12961.155%)
>> >
>> > I did some heavy failover tests, but a value higher than 100% looks
>> > strange
>> > (ceph version 12.2.0). Recovery is quite slow.
>> >
>> >   cluster:
>> >     health: HEALTH_WARN
>> >             3/1524 objects misplaced (0.197%)
>> >             Degraded data redundancy: 197528/1524 objects degraded (12961.155%), 1057 pgs unclean, 1055 pgs degraded, 3 pgs undersized
>> >
>> >   data:
>> >     pools:   1 pools, 2048 pgs
>> >     objects: 508 objects, 1467 MB
>> >     usage:   127 GB used, 35639 GB / 35766 GB avail
>> >     pgs:     197528/1524 objects degraded (12961.155%)
>> >              3/1524 objects misplaced (0.197%)
>> >              1042 active+recovery_wait+degraded
>> >              991  active+clean
>> >              8    active+recovering+degraded
>> >              3    active+undersized+degraded+remapped+backfill_wait
>> >              2    active+recovery_wait+degraded+remapped
>> >              2    active+remapped+backfill_wait
>> >
>> >   io:
>> >     recovery: 340 kB/s, 80 objects/s
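
For what it's worth, the percentage is simply the ratio of the two
numbers in that status line, and the denominator is the total number
of object *copies*, not objects: assuming a 3x replicated pool (which
508 objects x 3 = 1524 suggests), a stale degraded count can run far
past 100%. A quick sanity check of the arithmetic:

  # total copies: 508 objects x 3 replicas (the replica count is an assumption)
  python3 -c 'print(508 * 3)'
  # -> 1524

  # degraded object instances over total copies, as a percentage
  python3 -c 'print(197528 / 1524 * 100)'
  # -> ~12961.155, matching the 12961.155% above
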
>>
>> Did you ever get to the bottom of this? I'm seeing something very
>> similar on a 12.2.1 reference system:
>>
>> https://gist.github.com/fghaas/f547243b0f7ebb78ce2b8e80b936e42c
>>
>> I'm also seeing an unusual MISSING_ON_PRIMARY count in "rados df":
>> https://gist.github.com/fghaas/59cd2c234d529db236c14fb7d46dfc85
>>
>> The odd thing in there is that the "bench" pool was empty when the
>> recovery started (that pool had been wiped with "rados cleanup"), so
>> the number of objects deemed to be missing from the primary really
>> ought to be zero.
>>
>> It seems like it's considering these deleted objects to still require
>> replication, but that sounds rather far-fetched, to be honest.
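
For anyone who wants to poke at the same state, these are the kinds of
read-only checks involved (a sketch; the pool name "bench" is the one
from above):

  # overall health plus the degraded/misplaced counters
  ceph status
  ceph health detail

  # per-pool object counts, including the MISSING_ON_PRIMARY column
  rados df

  # confirm the "bench" pool really is empty after the cleanup
  rados -p bench ls | wc -l

  # which PGs are still flagged degraded or recovering
  ceph pg dump pgs_brief 2>/dev/null | grep -E 'degraded|recover'
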
>
>
> Actually, that makes some sense. This cluster had an OSD down while (some
> of) the deletes were happening?
I thought of exactly that too, but no, it didn't. That's the problem.

Okay, in that case I've no idea. What was the timeline for the
recovery versus the rados bench and cleanup versus the degraded object
counts, then?

> I haven't dug through the code, but I bet it is considering those as
> degraded objects because the out-of-date OSD knows it doesn't have the
> latest versions of them! :)
Yeah I bet against that. :)
Another tidbit: these objects were not deleted with rados rm; they
were cleaned up after rados bench. In the case quoted above, this was
an explicit "rados cleanup" after "rados bench --no-cleanup"; in
another, I saw the same behavior after a regular "rados bench" that
included the automatic cleanup.
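
Concretely, the two variants would look something like this (the pool
name and the 60-second duration are just examples):

  # variant 1: explicit cleanup after a --no-cleanup run
  rados -p bench bench 60 write --no-cleanup
  rados -p bench cleanup

  # variant 2: a plain run, which deletes its benchmark objects itself
  rados -p bench bench 60 write
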
So there are two hypotheses here:
(1) The deletion in rados bench is neglecting to do something that a
regular object deletion does do. Given that at least one other thing
is fishy in rados bench (http://tracker.ceph.com/issues/21375), this
may be due to some simple oversight in the Luminous cycle, and thus
would constitute a fairly minor (if irritating) issue.
(2) Regular object deletion is buggy in some previously unknown
fashion. That would be a rather major problem.
(A sketch for telling these two apart follows below.)
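
One rough way to discriminate: repopulate the pool, delete the objects
one by one with rados rm instead of letting bench clean up, then
repeat the same failover and compare the counters (a sketch; the OSD
id and timings are placeholders):

  # repopulate, then delete every object individually
  rados -p bench bench 60 write --no-cleanup
  rados -p bench ls | while read obj; do rados -p bench rm "$obj"; done

  # repeat the failover and watch the degraded counts
  systemctl stop ceph-osd@0 && sleep 60 && systemctl start ceph-osd@0
  ceph status

If the counters only blow up after a bench cleanup, that points at
(1); if they blow up either way, (2) becomes more plausible.
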
These both seem exceedingly unlikely. *shrug*

By the way, *deleting the pool* altogether makes the degraded object
count drop to expected levels immediately. Probably no surprise there,
though.
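
(For completeness, "deleting the pool" on Luminous means something
like the following; the mons refuse it unless pool deletion is
explicitly allowed first:)

  # allow pool deletion on the mons (off by default in Luminous)
  ceph tell mon.\* injectargs '--mon-allow-pool-delete=true'

  # drop the bench pool; the name has to be given twice as a safety check
  ceph osd pool delete bench bench --yes-i-really-really-mean-it
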
Cheers,
Florian