Thanks Sage, that makes sense.

I would just like to confirm my understanding. Let's say the map for the PG
changes from [1, 2, 3] to [1, 2, 4], and 4 is the backfill target (4 may or
may not already have info for the PG, which should not matter as long as it
is not eligible for log-based recovery). In that case the PG's last_backfill
and backfill_info.begin would reset to MIN, same for 4, and the flow would be
something like:

1. OSD 1 runs scan_range starting from backfill_info.begin for a configured
   number of entries.
2. OSD 1 queries the digest from 4 for that range.
3. Once the digest is back from 4, fill the to_push and to_remove lists by
   comparing the local and remote entries (here we also consider the PG log
   to capture the most recent updates).
4. If either to_push or to_remove is not empty, push the objects to (or
   remove them from) the backfill target.
5. Move the last_backfill pointer and continue with step 1.

Thanks,
Guang

----------------------------------------
> Date: Fri, 11 Sep 2015 05:57:42 -0700
> From: sage@xxxxxxxxxxxx
> To: yguang11@xxxxxxxxxxx
> CC: ceph-devel@xxxxxxxxxxxxxxx; sjust@xxxxxxxxxx
> Subject: Re: Backfill
>
> On Thu, 10 Sep 2015, GuangYang wrote:
>> Today I played around recovery and backfill of a Ceph cluster (by
>> manually bringing some OSDs down/out), and got one question regards to
>> the current flow:
>>
>> Does backfill push everything to the backfill target regardless what the
>> backfill target already has? The scenario is like - acting set of the PG
>> is [1, 2, 3], and 3 went down (at which point it already had some data)
>> and stayed down for a sustained period (but not marked out), during
>> which time there were sustained WRITE to the PG. At some point 3 went
>> back up, and it is not sufficient to recovery via PG log, so the PG
>> needed to be backfilled and 3 is the target. Does 1 needs to push
>> everything (last_backfill starts with MIN) to 3?
>> It seems so to me as I
>> don't see some round trip to negotiate what each OSD has and do an
>> incremental push (as recovery does), but it would be nice to get confirm
>> :)
>
> No. Backfill iterates over objects on the source and destination and
> only pushes objects that are missing or out of date (and deletes ones
> that shouldn't be there). This is all in ReplicatedPG::recover_backfill()
> (though it's not the easiest read).
>
> sage
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
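
As a rough illustration of steps 1 to 5 at the top of this message, here is
a toy Python model of the loop. It is only a sketch under several
assumptions and is not how ReplicatedPG::recover_backfill() is actually
written: objects are plain names mapped to integer versions, the empty
string stands in for MIN, the "digest" is simply the target's entries in
the scanned interval, and the PG log is ignored. The helper names
(scan_range as written here, backfill, max_entries) are hypothetical; only
last_backfill, to_push and to_remove come from the discussion above.

```python
# Toy model of the backfill flow described in this thread.
# NOT Ceph code: object stores are dicts of name -> version (an assumption).


def scan_range(store, begin, max_entries):
    """Return up to max_entries objects whose name sorts after `begin`."""
    names = sorted(n for n in store if n > begin)[:max_entries]
    return {n: store[n] for n in names}


def backfill(primary, target, max_entries=2):
    """Make `target` match `primary` chunk by chunk; return (pushed, removed)."""
    last_backfill = ""  # stands in for MIN: sorts before every object name
    pushed, removed = [], []
    while True:
        # Step 1: scan the next chunk on the primary.
        local = scan_range(primary, last_backfill, max_entries)
        # The interval ends at the last scanned name, or is unbounded
        # once the primary has nothing left to scan.
        end = max(local) if local else None
        # Step 2: ask the target what it holds in the same interval.
        remote = {
            n: v for n, v in target.items()
            if n > last_backfill and (end is None or n <= end)
        }
        # Step 3: compare local and remote entries.
        to_push = [n for n, v in local.items() if remote.get(n) != v]
        to_remove = sorted(n for n in remote if n not in local)
        # Step 4: push missing or stale objects, remove stray ones.
        for n in to_push:
            target[n] = primary[n]
            pushed.append(n)
        for n in to_remove:
            del target[n]
            removed.append(n)
        if end is None:
            break
        # Step 5: advance the pointer and go around again.
        last_backfill = end
    return pushed, removed


primary = {"a": 2, "b": 1, "c": 3, "d": 1}
target = {"a": 1, "b": 1, "bb": 1, "z": 4}
pushed, removed = backfill(primary, target)
```

In this example "b" is never pushed because the target already holds the
same version, which matches the point above that backfill only pushes
objects that are missing or out of date and deletes ones that should not
be there.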