Yeah, I'm more concerned about individual object durability. This seems like
a good way (in ongoing flapping or whatever) for objects at the tail end of a
PG to never get properly replicated, even as we expend lots of IO repeatedly
recovering earlier objects which are already better replicated. :/ Perhaps
min_size et al make this a moot point, but... I don't think so. I haven't
worked it all the way through.
-Greg

On Fri, Nov 6, 2015 at 8:48 AM, Samuel Just <sjust@xxxxxxxxxx> wrote:
> Nope, it's worse: there could be arbitrary interleavings of backfilled and
> unbackfilled portions on any particular incomplete osd. We'd need a
> backfilled_regions field with a type like map<hobject_t, hobject_t>
> mapping backfilled regions begin->end. It's pretty tedious, but
> doable, provided that we bound how large the mapping gets. I'm
> skeptical about how large an effect this would actually have on
> overall durability (how frequent is this case?). Once Allen does the
> math, we'll have a better idea :)
> -Sam
>
> On Fri, Nov 6, 2015 at 8:43 AM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>> Argh, I guess I was wrong. Sorry for the misinformation, all! :(
>>
>> If we were to try and do this, Sam, do you have any idea how much it
>> would take? Presumably we'd have to add a backfill_begin marker to
>> bookend with last_backfill_started, and then everywhere we send over
>> object ops we'd have to compare against both of those values. But I'm
>> not sure how many sites that's likely to be, what other kinds of paths
>> rely on last_backfill_started, or whether I'm missing something.
>> -Greg
>>
>> On Fri, Nov 6, 2015 at 8:30 AM, Samuel Just <sjust@xxxxxxxxxx> wrote:
>>> What it actually does is rebuild 3 until it catches up with 2 and then
>>> it rebuilds them in parallel (to minimize reads). Optimally, we'd
>>> start 3 from where 2 left off and then circle back, but we'd have to
>>> complicate the metadata we use to track backfill.
>>> -Sam
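
For concreteness, here is a minimal sketch of the backfilled_regions idea
Sam describes: a map from region begin to region end, plus a merge-on-insert
step so the mapping stays bounded. This is illustrative only, not Ceph code;
BackfillIntervals, is_backfilled, and mark_backfilled are made-up names, and
std::string stands in for hobject_t purely because both are totally ordered.

  // Sketch of "backfilled_regions": begin -> end intervals recording which
  // ranges of the PG's object ordering have been backfilled on an incomplete
  // OSD. Hypothetical code; std::string stands in for hobject_t.
  #include <cassert>
  #include <iterator>
  #include <map>
  #include <string>

  using Obj = std::string;  // stand-in for hobject_t (any totally ordered key)

  struct BackfillIntervals {
    // begin -> end (half-open): every object in [begin, end) is backfilled.
    std::map<Obj, Obj> regions;

    // Does obj fall inside some recorded backfilled region, i.e. would the
    // incomplete OSD already hold a copy of it?
    bool is_backfilled(const Obj& obj) const {
      auto it = regions.upper_bound(obj);   // first region with begin > obj
      if (it == regions.begin()) return false;
      --it;                                 // region with begin <= obj
      return obj < it->second;              // inside if obj precedes its end
    }

    // Record that [begin, end) has been backfilled, merging overlapping or
    // touching regions so the map stays small (the "bound how large the
    // mapping gets" part).
    void mark_backfilled(Obj begin, Obj end) {
      auto it = regions.upper_bound(begin);
      if (it != regions.begin()) {
        auto prev = std::prev(it);
        if (prev->second >= begin) {        // overlaps/touches previous region
          begin = prev->first;
          if (prev->second > end) end = prev->second;
          it = regions.erase(prev);
        }
      }
      while (it != regions.end() && it->first <= end) {  // swallow later regions
        if (it->second > end) end = it->second;
        it = regions.erase(it);
      }
      regions[begin] = end;
    }
  };

  int main() {
    BackfillIntervals bi;
    bi.mark_backfilled("0000", "3000");  // range copied before the interruption
    bi.mark_backfilled("7000", "9000");  // a second target resumed further along
    assert(bi.is_backfilled("1234"));
    assert(!bi.is_backfilled("5000"));   // gap between the two regions
    bi.mark_backfilled("2500", "7500");  // backfill closes the gap; regions merge
    assert(bi.regions.size() == 1);
  }

Recovery paths would presumably then consult something like is_backfilled()
(in addition to the existing last_backfill comparison) before deciding
whether an incomplete OSD needs an object pushed, which is roughly the set
of call sites Greg is asking about above.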