On Fri, Sep 29, 2017 at 5:19 PM, Xinze Chi (信泽) <xmdxcxz@xxxxxxxxx> wrote:
> For example, transaction a modifies object a < last_backfill, so
> transaction_applied would be true. Before transaction a completes,
> transaction b modifies object b > last_backfill, so
> transaction_applied would be false. Would the current logic then
> roll forward both object a and object b? Is that right?

Yes, I believe that's the case. It's just that we don't care very
much: if we copied the data while backfilling, we know that our source
peer has the rollback state.

Keep in mind that we only have rollback so that we can avoid the "RAID
write hole". E.g., if we manage to write down an update on only 4 nodes
in a 5+3 erasure code, we can recover neither the old nor the new data
if it was written in-place (each version then exists on just 4 shards,
and decoding needs 5). So we keep rollback data in that case and
everybody goes back to the previous state.

I *think* that if we manage to backfill an object for a particular
shard, then we know that we can roll forward on it anyway, or the read
would have failed and the OSDs would have already rolled back. But I
didn't check that.

Certainly, doing something other than this automatic roll forward would
require a lot more bookkeeping that would make everything else going on
more difficult.
-Greg

>
>
> 2017-09-30 2:27 GMT+08:00 Gregory Farnum <gfarnum@xxxxxxxxxx>:
>> On Fri, Sep 29, 2017 at 3:02 AM, Xinze Chi (信泽) <xmdxcxz@xxxxxxxxx> wrote:
>>> hi, all
>>>
>>> I am confused by the roll_forward logic in PG::append_log. The
>>> pg_log.roll_forward function may roll forward all in-flight
>>> transactions, some of which may not have been completed by all shards.
>>>
>>> The comment also puzzles me, so could anyone explain it in
>>> detail? Thanks.
>>>
>>>
>>>   if (!transaction_applied) {
>>>     /* We must be a backfill peer, so it's ok if we apply
>>>      * out-of-turn since we won't be considered when
>>>      * determining a min possible last_update.
>>>      */
>>>     pg_log.roll_forward(&handler);
>>>   }
>>>
>>>   /* We don't want to leave the rollforward artifacts around
>>>    * here past last_backfill. It's ok for the same reason as
>>>    * above */
>>>   if (transaction_applied &&
>>>       p->soid > info.last_backfill) {
>>>     pg_log.roll_forward(&handler);
>>>   }
>>
>> transaction_applied can only be false if we are being backfilled. If
>> we are being backfilled, we may not *have* the older data that we
>> would roll back to, and our peers don't rely on us having that data. So
>> there's no point in our trying to keep rollback data around, and
>> keeping it around would mean finding a way to clean it up later. Thus,
>> delete it now.
>> -Greg
>
>
>
> --
> Regards,
> Xinze Chi
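
To make the scenario from the question concrete, here is a minimal sketch of the two conditions in the quoted snippet. It is a toy model, not Ceph code: MiniLog, append_entry, and scenario are hypothetical names, the log is indexed by position rather than by version, and object names are compared as plain strings instead of real object IDs. The only behaviour taken from the thread is that roll_forward() acts on the whole log rather than on a single entry.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, heavily simplified model of a PG log; not Ceph's PGLog.
struct MiniLog {
    std::vector<std::string> entries;  // object name (soid) of each log entry
    std::size_t rolled_forward = 0;    // entries [0, rolled_forward) have dropped their rollback data

    // Like the quoted pg_log.roll_forward(): covers every entry currently in the log.
    void roll_forward() { rolled_forward = entries.size(); }
};

// Follows the two conditions quoted from PG::append_log for a single entry.
void append_entry(MiniLog& log, const std::string& soid,
                  bool transaction_applied, const std::string& last_backfill) {
    if (!transaction_applied) {
        // Backfill peer: nobody relies on our rollback data, so drop it.
        log.roll_forward();
    }
    log.entries.push_back(soid);
    if (transaction_applied && soid > last_backfill) {
        // Do not keep rollforward artifacts past last_backfill.
        log.roll_forward();
    }
}

// The scenario from the question: returns how many entries lost rollback data.
std::size_t scenario() {
    MiniLog log;
    const std::string last_backfill = "ab";       // "a" < "ab" < "b"
    append_entry(log, "a", true, last_backfill);  // applied, below last_backfill: rollback data kept
    append_entry(log, "b", false, last_backfill); // not applied: rolls forward the existing log
    return log.rolled_forward;
}
```

In this model, the entry for object a keeps its rollback data until the unapplied entry for object b arrives, at which point the log-wide roll forward covers a as well. That matches the "yes" in the reply above; whether it is always safe is the part the reply marks as unchecked.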