Jeff King <peff@xxxxxxxx> writes:

> On Fri, Oct 11, 2024 at 03:54:51AM -0400, Jeff King wrote:
>
>> [1] I wish we had good names to distinguish the various cases, because
>>     the term "reuse" is kind of overloaded. The "slower" regular
>>     object-sending path may still reuse verbatim bytes found in an
>>     on-disk path. But this "blit out matching parts of a pack without
>>     otherwise considering the objects" feature happens outside of that.
>>     We called it "pack reuse" back in 2013, but that was not a good name
>>     even then. I don't have a good suggestion, though.
>
> Actually, confusing things more, there are really _three_ layers of
> reuse:
>
>   1. At the beginning of a pack, we can blit out the bytes for objects
>      starting from the beginning of the pack that are being sent (we
>      know any delta will be satisfied since its base comes before it).

Yes, I wouldn't be worried about that one.  The data encoded as an
ofs-delta in this section already point at their base correctly in
the original pack, and in the resulting pack.

>   2. After that, we process objects one by one, but do so very cheaply
>      by just deciding if we can blit them out one by one, fixing up
>      delta base offsets to account for gaps.

This is the part I said "we have to remember where the base was
emitted and subtract it from the offset of the delta anyway even if
we are reusing delta from the same pack, so what do we need a
separate code path for this?" in my initial response.  I guess,
"fixing up" could be done by using the difference between offsets in
the original pack for this step, which would be an unfortunate design
that prevents it from getting reused.

>   3. Otherwise, we generate an object_entry struct in packing_data for
>      them, try to find new deltas, and so on. We may then reuse the
>      on-disk bytes after deciding they're suitable.

It is a bit unfortunate, if we were to trust the existing delta base
selection in the pack like we did since Feb 2006 [*], we should be
omitting the "try to find new deltas" step.  Perhaps that comes for
free as the object_entry knows that our object has a delta_base?

> We call all of these "reuse", and certainly both (1) and (2) are "pack
> reuse", but I think that term is sufficiently vague that it could apply
> to (3) as well.
>
> -Peff
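
To make the two ways of "fixing up" in (2) concrete, here is a rough
sketch.  It is not taken from pack-objects; the struct and function
names (entry, emit_offsets, rel_by_output, rel_by_gaps) are made up
for illustration.  It assumes the objects sit back to back in the
original pack, and it ignores that the re-encoded offset may need a
different number of header bytes.  Approach A remembers where each
object was emitted and takes the difference of output positions.
Approach B starts from the distance in the original pack and
subtracts the omitted bytes in between.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified model: one entry per object in the
 * original pack, sorted by offset and assumed to be contiguous.
 * None of these names come from git's own code.
 */
struct entry {
	uint64_t orig_offset;	/* start of the object in the original pack */
	uint64_t size;		/* on-disk bytes the object occupies */
	int wanted;		/* non-zero if it is copied to the output */
	int base;		/* index of its ofs-delta base, or -1 */
};

/*
 * Approach A, step 1: record where each object would be emitted.
 * out_offset[] must have room for nr slots.  Slots of omitted
 * objects are filled in but never consulted for a wanted base.
 */
static void emit_offsets(const struct entry *e, size_t nr,
			 uint64_t start, uint64_t *out_offset)
{
	uint64_t pos = start;
	size_t i;

	for (i = 0; i < nr; i++) {
		out_offset[i] = pos;
		if (e[i].wanted)
			pos += e[i].size;
	}
}

/* Approach A, step 2: distance between the emitted positions. */
static uint64_t rel_by_output(const struct entry *e,
			      const uint64_t *out_offset, size_t i)
{
	assert(e[i].base >= 0 && (size_t)e[i].base < i);
	return out_offset[i] - out_offset[e[i].base];
}

/*
 * Approach B: take the distance in the original pack and subtract
 * the bytes of every omitted object between the base and the delta.
 */
static uint64_t rel_by_gaps(const struct entry *e, size_t i)
{
	uint64_t rel;
	size_t j;

	assert(e[i].base >= 0 && (size_t)e[i].base < i);
	rel = e[i].orig_offset - e[e[i].base].orig_offset;
	for (j = (size_t)e[i].base + 1; j < i; j++)
		if (!e[j].wanted)
			rel -= e[j].size;
	return rel;
}
```

Under these assumptions both approaches give the same distance for a
wanted delta whose base is also wanted.  The difference is only in
what has to be tracked: A needs the emitted position of each base,
while B needs the sizes of the gaps in the original pack.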