On Mon, Oct 1, 2012 at 9:26 AM, Shawn Pearce <spearce@xxxxxxxxxxx> wrote:
> One of the more troublesome problems is building the bitmaps is
> difficult from a streaming processor like index-pack. You need the
> reachability graph for all objects, which is not currently produced
> when moving data over the wire. We do an fsck after-the-fact to verify
> we didn't get corrupt data, but this is optional and currently after
> the pack is stored. We need to refactor this code to run earlier to
> get the bitmap built. If we take Peff's idea and put the bitmap data
> into a new stream rather than the pack-*.idx file we can produce the
> bitmap at the same time as the fsck check, which is probably a simpler
> change.

If we need to walk the whole pack rather than do random SHA-1 access,
then index-pack's traversal is more efficient. I have some patches that
remove pack-check.c and make fsck use index-pack to walk through
packs. That takes much less time. But the rev walk needed for building
bitmaps probably does not fit this style of traversal, because a rev
walk does not follow the same order as a delta walk.

> Defining the pack's "edge" as a list of SHA-1s not in this pack but
> known to be required allows us to compute that leaf root tree
> reachability once, and never consider parsing it again. Which saves
> servers that host frequently accessed Git repositories but aren't
> repacking all of the time. (FWIW we repack frequently, I hear GitHub
> does too, because a fully repacked repository serves clients better
> than a partially packed one.)

Probably off topic. Does saving a list of missing bases in the pack
index help with storing thin packs directly? I may be missing something,
because I don't see why thin packs cannot be stored on disk in the
first place.
--
Duy
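
To make the "edge" and missing-base idea above a bit more concrete, here is a rough Python sketch. It is not how git or JGit represent this: `pack_edge`, the `references` mapping, and the toy object IDs are all made up for illustration. It just collects every ID that some object in the pack points at (parent, tree entry, or delta base) but that the pack itself does not contain.

```python
def pack_edge(pack_objects, references):
    """Return the IDs referenced from inside the pack but not stored in it.

    pack_objects: set of object IDs present in the pack (hypothetical input)
    references:   dict mapping an object ID to the IDs it points at, e.g.
                  parents, tree entries, or a delta base (also hypothetical)
    """
    edge = set()
    for oid in pack_objects:
        for ref in references.get(oid, ()):
            if ref not in pack_objects:
                # Required object that lives in some other pack.
                edge.add(ref)
    return sorted(edge)


# Toy data: short strings stand in for real SHA-1s.
pack = {"c2", "t2", "b2"}
refs = {
    "c2": ["c1", "t2"],  # commit -> parent commit, root tree
    "t2": ["b1", "b2"],  # tree -> blobs; b1 is not in this pack
    "b2": ["b1"],        # b2 is a delta against b1, as in a thin pack
}

print(pack_edge(pack, refs))  # ['b1', 'c1']
```

In this toy model a missing delta base and a missing parent end up in the same list. Whether a real on-disk format would want to keep those two kinds of entries separate is an open question that the sketch does not try to answer.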