Re: Fwd: how io works when backfill

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



if add in osd.7 and 7 becomes the primary: pg1.0 [1, 2, 3]  --> pg1.0
[7, 2, 3],  is it similar with the example above?
still install a pg_temp entry mapping the PG back to [1, 2, 3], then
backfill happens to 7, normal io write to [1, 2, 3], if io to the
portion of the PG that has already been backfilled will also be sent
to osd.7?

how about these examples about removing an osd:
- pg1.0 [1, 2, 3]
- osd.3 down and be removed
- mapping changes to [1, 2, 5], but osd.5 has no data, then install a
pg_temp mapping the PG back to [1, 2], then backfill happens to 5,
- normal io write to [1, 2], if io hits object which has been
backfilled to osd.5, io will also send to osd.5
- when backfill completes, remove the pg_temp and mapping changes back
to [1, 2, 5]


another example:
- pg1.0 [1, 2, 3]
- osd.3 down and be removed
- mapping changes to [5, 1, 2], but osd.5 has no data of the pg, then
install a pg_temp mapping the PG back to [1, 2] which osd.1
temporarily becomes the primary, then backfill happens to 5,
- normal io write to [1, 2], if io hits object which has been
backfilled to osd.5, io will also send to osd.5
- when backfill completes, remove the pg_temp and mapping changes back
to [5, 1, 2]

is my ananysis right?

2015-12-29 1:30 GMT+08:00 Sage Weil <sage@xxxxxxxxxxxx>:
> On Mon, 28 Dec 2015, Zhiqiang Wang wrote:
>> 2015-12-27 20:48 GMT+08:00 Dong Wu <archer.wudong@xxxxxxxxx>:
>> > Hi,
>> > When add osd or remove osd, ceph will backfill to rebalance data.
>> > eg:
>> > - pg1.0    [1, 2, 3]
>> > - add an osd(eg. osd.7)
>> > - ceph start backfill, then pg1.0 osd set changes to [1, 2, 7]
>> > - if [a, b, c, d, e] are objects needing to backfill to osd.7 and now
>> > object a is backfilling
>> > - when a write io hits object a, then the io needs to wait for its
>> > complete, then goes on.
>> > - but if io hits object b which has not been backfilled, io reaches
>> > osd.1, then osd.1 send the io to osd.2  and osd.7, but osd.7 does not
>> > have object b, so osd.7 needs to wait for object b to backfilled, then
>> > write. Is it right? Or osd.1 only send the io to osd.2, not both?
>>
>> I think in this case, when the write of object b reaches osd.1, it
>> holds the client write, raises the priority of the recovery of object
>> b, and kick off the recovery of it. When the recovery of object b is
>> done, it requeue the client write, and then everything goes like
>> usual.
>
> It's more complicated than that.  In a normal (log-based) recovery
> situation, it is something like the above: if the acting set is [1,2,3]
> but 3 is missing the latest copy of A, a write to A will block on the
> primary while the primary initiates recovery of A immediately.  Once that
> completes the IO will continue.
>
> For backfill, it's different.  In your example, you start with [1,2,3]
> then add in osd.7.  The OSD will see that 7 has no data for teh PG and
> install a pg_temp entry mapping the PG back to [1,2,3] temporarily.  Then
> things will proceed normally while backfill happens to 7.  Backfill won't
> interfere with normal IO at all, except that IO to the portion of the PG
> that has already been backfilled will also be sent to the backfill target
> (7) so that it stays up to date.  Once it complets, the pg_temp entry is
> removed and the mapping changes back to [1,2,7].  Then osd.3 is allowed to
> remove it's copy of the PG.
>
> sage
>
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [CEPH Users]     [Ceph Large]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux