So we can roll forward whether the object is > last_backfill or
< last_backfill, as long as the shard is a backfill target? If so, could
we set backfill to true in ECSubWrite whenever the shard is a backfill
target?

2017-10-05 2:11 GMT+08:00 Gregory Farnum <gfarnum@xxxxxxxxxx>:
> On Fri, Sep 29, 2017 at 5:19 PM, Xinze Chi (信泽) <xmdxcxz@xxxxxxxxx> wrote:
>> For example, transaction a modifies object a < last_backfill, so
>> transaction_applied is true. Before transaction a completes,
>> transaction b modifies object b > last_backfill, so
>> transaction_applied is false. The current logic would then call
>> roll_forward, which covers both object a and object b. Is that right?
>
> Yes, I believe that's the case. It's just that we don't care very much:
> if we copied the data while backfilling, we know that our source
> peer has the rollback state. Keep in mind that we only have rollback
> so that we can avoid the "RAID write hole". E.g., if we manage to write
> down an update on 4 nodes in a 5+3 erasure code, we can recover
> neither the old nor new data if it was written in-place. So we keep
> rollback data in that case and everybody goes back to the previous
> state.
>
> I *think* that if we manage to backfill an object for a particular
> shard, then we know that we can roll forward on it anyway or the read
> would have failed and the OSDs would have already rolled back. But I
> didn't check that. Certainly doing something other than this automatic
> roll forward would require a lot more bookkeeping that would make
> everything else going on more difficult.
> -Greg
>
>>
>>
>> 2017-09-30 2:27 GMT+08:00 Gregory Farnum <gfarnum@xxxxxxxxxx>:
>>> On Fri, Sep 29, 2017 at 3:02 AM, Xinze Chi (信泽) <xmdxcxz@xxxxxxxxx> wrote:
>>>> Hi all,
>>>>
>>>> I am confused by the roll_forward logic in PG::append_log. The
>>>> pg_log.roll_forward function may roll forward all in-flight
>>>> transactions, including ones not yet completed by all shards.
>>>>
>>>> The comment also makes me wonder.
>>>> so could anyone explain it in detail. thanks.
>>>>
>>>>
>>>>   if (!transaction_applied) {
>>>>     /* We must be a backfill peer, so it's ok if we apply
>>>>      * out-of-turn since we won't be considered when
>>>>      * determining a min possible last_update.
>>>>      */
>>>>     pg_log.roll_forward(&handler);
>>>>   }
>>>>
>>>>   /* We don't want to leave the rollforward artifacts around
>>>>    * here past last_backfill. It's ok for the same reason as
>>>>    * above */
>>>>   if (transaction_applied &&
>>>>       p->soid > info.last_backfill) {
>>>>     pg_log.roll_forward(&handler);
>>>>   }
>>>
>>> transaction_applied can only be false if we are being backfilled. If
>>> we are being backfilled, we may not *have* the older data that we
>>> would rollback to, and our peers don't rely on us having that data. So
>>> there's no point in our trying to keep rollback data around, and
>>> keeping it around would mean finding a way to clean it up later. Thus,
>>> delete it now.
>>> -Greg
>>
>>
>>
>> --
>> Regards,
>> Xinze Chi

--
Regards,
Xinze Chi
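
For anyone reading this thread later, the two conditions in the quoted snippet and the "4 nodes in a 5+3 erasure code" example can be restated as a small standalone sketch. This is not Ceph code. The names should_roll_forward and can_decode are hypothetical, and a plain integer stands in for the real object ordering; only transaction_applied, soid and last_backfill come from the quoted snippet.

```cpp
#include <cstdint>

// Hypothetical stand-in for the real object type: objects are ordered by
// a plain integer, and objects <= last_backfill have already been copied.
using ObjectId = std::uint64_t;

// Restates the two conditions from the quoted PG::append_log snippet.
// Returns true when that snippet would call pg_log.roll_forward().
bool should_roll_forward(bool transaction_applied,
                         ObjectId soid,
                         ObjectId last_backfill) {
  if (!transaction_applied) {
    // Per the thread, this only happens on a shard being backfilled,
    // which is not relied on for rollback data.
    return true;
  }
  // Applied, but the object is past last_backfill: do not keep
  // rollforward artifacts around for it.
  return soid > last_backfill;
}

// Assumption for illustration: a k+m erasure-coded object version can be
// reconstructed only if at least k shards hold that same version.
bool can_decode(unsigned k, unsigned shards_with_version) {
  return shards_with_version >= k;
}
```

With last_backfill = 100, should_roll_forward(true, 50, 100) is false and the other three combinations of (transaction_applied, soid past last_backfill) are true. For the 5+3 case, an in-place update that reached only 4 of the 8 shards leaves 4 new and 4 old shards, so can_decode(5, 4) is false for both versions, which is the write hole that keeping rollback data avoids.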