Re: ec overwrite issue

On Fri, Sep 29, 2017 at 5:19 PM, Xinze Chi (信泽) <xmdxcxz@xxxxxxxxx> wrote:
> For example, suppose transaction a modifies object a < last_backfill, so
> transaction_applied is true. Before transaction a completes,
> transaction b modifies object b > last_backfill, so
> transaction_applied is false. Would the current logic then
> roll forward both object a and object b? Is that right?

Yes, I believe that's the case. It's just that we don't care very much:
if we copied the data while backfilling, we know that our source
peer has the rollback state. Keep in mind that we only have rollback
so that we can avoid the "RAID write hole". E.g., if we manage to write
down an update on only 4 nodes in a 5+3 erasure code, we can recover
neither the old nor the new data if it was written in-place. So we keep
rollback data in that case and everybody goes back to the previous
state.
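
The arithmetic behind the 5+3 example can be sketched as below. This is
only an illustration, not Ceph code: recoverable(), in_place() and
with_rollback() are hypothetical helpers, and the only assumption is
that a k+m erasure code needs at least k shards of the same version to
reconstruct that version.

```cpp
#include <cassert>

// Hypothetical helper (not a Ceph API): a version of an object can be
// reconstructed only if at least k shards hold that version.
constexpr bool recoverable(int shards_with_version, int k) {
  return shards_with_version >= k;
}

struct Outcome {
  bool old_readable;
  bool new_readable;
};

// In-place overwrite: every shard holds either the old or the new
// version, so the old version survives only on the non-updated shards.
constexpr Outcome in_place(int k, int m, int updated) {
  return {recoverable(k + m - updated, k), recoverable(updated, k)};
}

// With rollback data kept: updated shards can revert, so all k+m
// shards can still return to the old version.
constexpr Outcome with_rollback(int k, int m, int updated) {
  return {recoverable(k + m, k), recoverable(updated, k)};
}

// 5+3 code, update landed on 4 shards: 4 old and 4 new, both below k = 5.
static_assert(!in_place(5, 3, 4).old_readable, "old data is lost");
static_assert(!in_place(5, 3, 4).new_readable, "new data is lost");
static_assert(with_rollback(5, 3, 4).old_readable, "rollback restores old data");
```

With 4 of 8 shards updated, neither version reaches the 5 shards needed,
which is why the rollback data has to exist until the write is known to
be complete everywhere.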

I *think* that if we manage to backfill an object for a particular
shard, then we know that we can roll forward on it anyway; otherwise the
read would have failed and the OSDs would have already rolled back. But I
didn't check that. Certainly doing something other than this automatic
roll forward would require a lot more bookkeeping, which would make
everything else going on more difficult.
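
To make the scenario from the question concrete, here is a toy model. It
is not Ceph's PGLog: Entry, ToyLog and the integer "position" used to
compare against last_backfill are all made up for illustration, and
roll_forward() is assumed to simply drop rollback data for every entry
currently in the log.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy log entry: has_rollback is true until the entry is rolled forward.
struct Entry {
  std::string name;
  int position;       // stand-in for the object's sort order
  bool has_rollback;
};

struct ToyLog {
  std::vector<Entry> entries;
  int last_backfill;  // objects at or below this position are backfilled

  // Assumed behaviour: drops rollback data for all entries, not just the newest.
  void roll_forward() {
    for (Entry& e : entries) e.has_rollback = false;
  }

  // Mirrors the two quoted conditions: roll forward if the transaction was
  // not applied locally, or if the object lies past last_backfill.
  void append(const std::string& name, int position, bool transaction_applied) {
    entries.push_back(Entry{name, position, true});
    if (!transaction_applied || position > last_backfill) roll_forward();
  }

  std::size_t pending_rollback() const {
    std::size_t n = 0;
    for (const Entry& e : entries) {
      if (e.has_rollback) ++n;
    }
    return n;
  }
};

// Replays the question: a < last_backfill (applied), then b > last_backfill
// (not applied). Returns how many entries still hold rollback data.
std::size_t replay_question_scenario() {
  ToyLog log{{}, 5};
  log.append("a", 1, true);
  assert(log.pending_rollback() == 1);  // a keeps its rollback data
  log.append("b", 9, false);
  return log.pending_rollback();
}
```

In this model replay_question_scenario() returns 0: appending b rolls
forward a as well, which is the behaviour being asked about.
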
-Greg

>
>
> 2017-09-30 2:27 GMT+08:00 Gregory Farnum <gfarnum@xxxxxxxxxx>:
>> On Fri, Sep 29, 2017 at 3:02 AM, Xinze Chi (信泽) <xmdxcxz@xxxxxxxxx> wrote:
>>> Hi all,
>>>
>>>     I am confused by the roll_forward logic in PG::append_log. The
>>> pg_log.roll_forward function may roll forward all in-flight
>>> transactions, which may not have been completed by all shards.
>>>
>>>     The comment also makes me wonder. Could anyone explain it in
>>> detail? Thanks.
>>>
>>>
>>>   if (!transaction_applied) {
>>>      /* We must be a backfill peer, so it's ok if we apply
>>>       * out-of-turn since we won't be considered when
>>>       * determining a min possible last_update.
>>>       */
>>>     pg_log.roll_forward(&handler);
>>>   }
>>>
>>>     /* We don't want to leave the rollforward artifacts around
>>>      * here past last_backfill.  It's ok for the same reason as
>>>      * above */
>>>     if (transaction_applied &&
>>>        p->soid > info.last_backfill) {
>>>       pg_log.roll_forward(&handler);
>>>     }
>>
>> transaction_applied can only be false if we are being backfilled. If
>> we are being backfilled, we may not *have* the older data that we
>> would roll back to, and our peers don't rely on us having that data. So
>> there's no point in our trying to keep rollback data around, and
>> keeping it around would mean finding a way to clean it up later. Thus,
>> delete it now.
>> -Greg
>
>
>
> --
> Regards,
> Xinze Chi