Jon Smirl <jonsmirl@xxxxxxxxx> wrote:
> Shawn put together a new version of his import utility that packs all
> of the deltas from a run into a single blob instead of one blob per
> delta. The idea is to put 10 or more deltas into each delta entry
> instead of one. The index format would map the 10 sha1's to a single
> packed delta entry which would be expanded when needed. Note that you
> probably needed multiple entries out of the delta pack to generate the
> revision you were looking for so this is no real loss on extraction.
>
> I ran it overnight on mozcvs. If his delta pack code is correct this
> is a huge win.
>
> One entry per delta - 845,420,150
> Packed deltas - 295,018,474
> 65% smaller
>
> The effect of packing the deltas is to totally eliminate many of the
> redundant zlib dictionaries.

I'm going to try to integrate this into core GIT this weekend.
My current idea is to make use of the OBJ_EXT type flag to add
an extended header field behind the length which describes the
"chunk" as being a delta chain compressed in one zlib stream.
I'm not overly concerned about saving lots of space in the header
here as it looks like we're winning a huge amount of pack space,
so the extended header will probably itself be a couple of bytes.
This keeps the shorter reserved types free for other great ideas. :)

My primary goal of integrating it into core GIT is to take advantage
of verify-pack to check the file fast-import is producing. Plus
having support for it in sha1_file.c will make it easier to
performance test the common access routines that need to be fast,
like commit and tree walking.

My secondary goal is to get a patchset which other folks can try
on their own workloads to see if it's as effective as what Jon is
seeing on the Mozilla archive.

Unfortunately I can't think of a way to make this type of pack
readable by older software. So this could be creating a pretty
big change in the pack format, relatively speaking. :)

--
Shawn.
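
For anyone who wants to see the effect Jon describes on a toy input, here
is a rough sketch in Python. It is not the fast-import code and says nothing
about the pack format itself: the function names and the sample deltas are
made up for illustration, and a real packed entry would also need to record
each delta's length so the chain can be split again after inflating.

```python
import zlib


def separate_size(deltas):
    """Total bytes when every delta is deflated as its own zlib stream."""
    return sum(len(zlib.compress(d)) for d in deltas)


def packed_size(deltas):
    """Bytes when the whole chain is deflated as a single zlib stream."""
    return len(zlib.compress(b"".join(deltas)))


# Hypothetical stand-ins for ten small, similar deltas from one file's history.
deltas = [
    b"@@ -%d,3 +%d,4 @@\n-  return nsresult(NS_ERROR_FAILURE);\n+  return NS_OK;\n" % (i, i)
    for i in range(10, 20)
]

separate = separate_size(deltas)
packed = packed_size(deltas)
print("one stream per delta:", separate)
print("one stream per chain:", packed)
```

Each separate stream starts with an empty window, so it cannot refer back
to text that appeared in an earlier delta of the same chain. The single
stream can, which is where the saving comes from on inputs like this.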