On Wed, Oct 30, 2013 at 3:47 PM, Vicent Marti <vicent@xxxxxxxxxx> wrote:
> On Wed, Oct 30, 2013 at 9:10 AM, Jeff King <peff@xxxxxxxx> wrote:
>>
>> In fact, I'm not quite sure that even a partial reuse up to an offset is
>> 100% safe. In a newly packed git repo it is, because we always put bases
>> before deltas (and OFS_DELTA objects need this). But if you had a bitmap
>> generated from a fixed thin pack, we would have REF_DELTA objects early
>> on that depend on bases appended to the end of the pack. So I really
>> wonder if we should scrap this partial reuse and either just have full
>> reuse, or go through the regular object_entry construction.
>>
>> Vicent, you've thought about the reuse code a lot more than I have. Any
>> thoughts?
>
> Yes, our pack writing and bitmap code takes enough precautions to
> arrange the objects in the packfile in a way that can be partially
> reused, so for any given bitmap file written from Git, I'd say we're
> safe to always reuse the leader of the pack if this is possible.
>
> For bitmaps generated from JGit, however, we cannot make this
> assumption. I mean, we can right now (from my understanding of the
> current implementation for pack-objects on JGit), but they are free to
> change this in the future.

JGit certainly doesn't promise the ordering behavior, so the fact that
it's happening is just luck. The code could change in the future to
invalidate this.

> Obviously I intend to keep the pack reuse on production because the
> CPU savings are noticeable, but we can drop it from the public
> patchset.

I think you should keep it in; it's a significant improvement.

> Ideally, we'd have full pack reuse like JGit, but we cannot
> reasonably do that in GitHub because splitting a pack for the network
> root would double our disk usage for all the forks.

I gave a talk the week before about Git bitmaps and why we sometimes
have to slice pack files by object.
Some guy in the audience kept yelling that since it's Git it's all
open source and `git clone` is "just" a file transfer problem. So maybe
for his GitHub repositories and forks it's OK to include the entire
fork network when someone clones? :-)
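
To make the ordering concern quoted above concrete, here is a toy sketch
in Python. It is not Git or JGit code and does not model the real pack
format: `Entry` and `first_unsafe_entry` are hypothetical names, and a
pack is reduced to an ordered list of object ids with an optional delta
base. Under those assumptions, a leading slice of the pack is only safe
to reuse verbatim if every delta in the slice finds its base inside the
same slice.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Entry:
    """Toy stand-in for one packed object (not the real on-disk layout)."""
    oid: str
    base: Optional[str] = None  # oid of the delta base; None for a full object


def first_unsafe_entry(pack: Sequence[Entry], prefix_len: int) -> Optional[Entry]:
    """Return the first delta in pack[:prefix_len] whose base lies outside
    that prefix, or None if the prefix is self-contained."""
    prefix = pack[:prefix_len]
    prefix_oids = {entry.oid for entry in prefix}
    for entry in prefix:
        if entry.base is not None and entry.base not in prefix_oids:
            return entry
    return None


# Freshly repacked: bases always precede their deltas.
repacked = [Entry("A"), Entry("B", base="A"), Entry("C", base="B"), Entry("D")]

# Fixed thin pack: the missing base "X" was appended at the end.
fixed_thin = [Entry("B", base="X"), Entry("C", base="B"), Entry("X")]

if __name__ == "__main__":
    print(first_unsafe_entry(repacked, 3))    # prefix is self-contained
    print(first_unsafe_entry(fixed_thin, 2))  # "B" needs "X", which is at the end
    print(first_unsafe_entry(fixed_thin, 3))  # whole pack: nothing dangling
```

In this model the fixed thin pack fails for any partial prefix but passes
when the whole pack is taken, which is the "full reuse or regular
object_entry construction" split suggested in the quoted text.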