On Thu, Nov 5, 2015 at 9:59 PM, Allen Samuels <Allen.Samuels@xxxxxxxxxxx> wrote:
> I have a question about rebuild in the following situation:
>
> I have a pool with 3x replication.
> For one particular PG we'll designate the active OSD set as [1,2,3]
> with 1 as the primary.
> Assume 2 and 3 crash with a TOTAL loss of local data.
> 2 restarts, fiddles about and then starts the backfill process.
> <A little time passes; here "little" means enough to make some
> progress in the rebuild process but nowhere near enough to complete it.>
> 3 restarts, fiddles about, then starts the backfill process.
>
> My question: Is there any optimization of the fact that we're
> rebuilding 2 OSDs at the same time (even though the rebuilds didn't
> start in lock step), OR do the two rebuilds continue independently?

I'm sure Sam will answer definitively, but I believe these are
independent. We don't really have the tracking structures to do
something intelligent, and I'm not sure how we'd go about it. Do you
want them to fill in the same data together, and then for OSD 3 to go
back and fill in the beginning of the PG that it missed? For OSD 2 to
stop getting data while we bring OSD 3 up to speed, and then continue in
lockstep? Something else I'm not thinking of that still makes use of the
sharing?
-Greg
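
To put rough numbers on the difference between independent rebuilds and
the first sharing option raised above (fill both OSDs together, then have
OSD 3 go back for the beginning it missed), here is a toy Python model.
It is not Ceph code and makes several assumptions: objects in the PG are
visited in one fixed order, the only cost counted is an object read on
the primary, and a single read could feed both targets in the shared
case, which the reply says the current tracking structures do not
support. The names simulate, late_start and share are hypothetical.

```python
def simulate(num_objects, late_start, share):
    """Count primary-side reads needed to backfill two empty targets.

    num_objects: objects in the PG, visited in a fixed order 0..N-1.
    late_start:  objects OSD 2 already received before OSD 3 began.
    share:       if True, one read on the primary feeds both targets
                 (hypothetical behaviour, for comparison only).
    """
    have = {2: set(range(late_start)), 3: set()}
    reads = late_start  # reads already spent on OSD 2 alone

    if share:
        # Continue from OSD 2's position, sending each object to both.
        for obj in range(late_start, num_objects):
            reads += 1
            have[2].add(obj)
            have[3].add(obj)
        # OSD 3 then goes back for the beginning of the PG it missed.
        for obj in range(late_start):
            reads += 1
            have[3].add(obj)
    else:
        # Independent: each target gets its own pass over what it lacks.
        for target in (2, 3):
            for obj in range(num_objects):
                if obj not in have[target]:
                    reads += 1
                    have[target].add(obj)

    # Both targets must end up with every object.
    assert all(len(objs) == num_objects for objs in have.values())
    return reads


if __name__ == "__main__":
    print("independent:", simulate(1000, 100, share=False))
    print("shared:     ", simulate(1000, 100, share=True))
```

Under these assumptions the independent case costs 2N reads and the
shared case N + k, where k is how far OSD 2 had progressed when OSD 3
started. The model ignores network cost, client I/O and the bookkeeping
needed to track a wrapped-around backfill position, which is the hard
part the reply points at.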