Re: Using bitmaps to accelerate fetch and clone

Nguyen Thai Ngoc Duy <pclouds@xxxxxxxxx> · Mon, 1 Oct 2012 19:48:41 +0700

On Mon, Oct 1, 2012 at 9:26 AM, Shawn Pearce <spearce@xxxxxxxxxxx> wrote:
> One of the more troublesome problems is that building the bitmaps is
> difficult from a streaming processor like index-pack. You need the
> reachability graph for all objects, which is not currently produced
> when moving data over the wire. We do an fsck after-the-fact to verify
> we didn't get corrupt data, but this is optional and currently after
> the pack is stored. We need to refactor this code to run earlier to
> get the bitmap built. If we take Peff's idea and put the bitmap data
> into a new stream rather than the pack-*.idx file we can produce the
> bitmap at the same time as the fsck check, which is probably a simpler
> change.

If we need to go through the whole pack rather than do random SHA-1
lookups, then index-pack's traversal is more efficient. I have some
patches that remove pack-check.c and make fsck use index-pack to walk
through packs, which takes much less time. But the rev walk needed to
build bitmaps probably does not fit this style of traversal: a rev walk
follows the commit/tree graph, while index-pack follows delta chains,
and the two orders do not align.
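
To make the mismatch concrete, here is a toy sketch in Python. It is not
git code: the object names, the dictionaries and both helper functions
are made up for illustration. delta_walk() visits objects the way a
base-first delta resolver roughly would, while reachability_bitmap()
needs to follow references between commits, trees and blobs.

```python
# Toy model only; none of these names exist in git.

# Position of each object in the pack, used as its bit index.
pack_order = ["blob1", "tree1", "commit1", "blob2", "tree2", "commit2"]
bit = {name: i for i, name in enumerate(pack_order)}

# Reachability graph: object -> objects it refers to directly.
refs = {
    "commit2": ["commit1", "tree2"],
    "commit1": ["tree1"],
    "tree2": ["blob1", "blob2"],
    "tree1": ["blob1"],
    "blob1": [],
    "blob2": [],
}

# Delta graph: object -> its delta base (None means stored whole).
delta_base = {
    "blob1": None,
    "tree1": None,
    "commit1": None,
    "blob2": "blob1",
    "tree2": "tree1",
    "commit2": None,
}


def delta_walk(delta_base):
    """Return objects in base-first order, each base followed by its deltas."""
    children = {}
    roots = []
    for obj, base in delta_base.items():
        if base is None:
            roots.append(obj)
        else:
            children.setdefault(base, []).append(obj)
    order = []
    stack = list(reversed(roots))
    while stack:
        obj = stack.pop()
        order.append(obj)
        stack.extend(reversed(children.get(obj, [])))
    return order


def reachability_bitmap(start, refs, bit):
    """Return an int bitset with one bit set per object reachable from start."""
    seen = set()
    stack = [start]
    bits = 0
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        bits |= 1 << bit[obj]
        stack.extend(refs[obj])
    return bits


delta_sequence = delta_walk(delta_base)
commit1_bits = reachability_bitmap("commit1", refs, bit)
commit2_bits = reachability_bitmap("commit2", refs, bit)
```

In this toy data the delta walk reaches tree2 right after tree1, before
it has seen either commit, so at that point it knows nothing about which
commit the tree belongs to. The bitmap for a commit, on the other hand,
is only complete once everything below that commit has been followed.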

> Defining the pack's "edge" as a list of SHA-1s not in this pack but
> known to be required allows us to compute that leaf root tree
> reachability once, and never consider parsing it again. Which saves
> servers that host frequently accessed Git repositories but aren't
> repacking all of the time. (FWIW we repack frequently, I hear GitHub
> does too, because a fully repacked repository serves clients better
> than a partially packed one.)

Probably off topic: would saving a list of missing bases in the pack
index help with storing thin packs directly? I may be missing something,
because I don't see why thin packs cannot be stored on disk in the
first place.
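
By "list of missing bases" I mean something like the following toy
sketch in Python. It is not git code and not a proposed index format;
missing_bases() and the sample data are hypothetical. A thin pack is
modelled as a map from object id to delta base id, and a base is
"missing" when some delta needs it but the pack does not contain it.

```python
# Toy model only; the function and the ids are made up for illustration.

def missing_bases(pack_objects):
    """Return sorted ids of delta bases that are not stored in the pack.

    pack_objects maps object id -> delta base id, or None if stored whole.
    """
    present = set(pack_objects)
    return sorted(
        {
            base
            for base in pack_objects.values()
            if base is not None and base not in present
        }
    )


# "c" is a delta against "x", which is not in this pack.
thin_pack = {"a": None, "b": "a", "c": "x"}
complete_pack = {"a": None, "b": "a"}

thin_missing = missing_bases(thin_pack)
complete_missing = missing_bases(complete_pack)
```

With such a list recorded next to the pack, a reader would know up
front which objects it has to find elsewhere before it can resolve
every delta.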
-- 
Duy