Nicolas Pitre <nico@xxxxxxx> wrote:
> On Sun, 27 Aug 2006, Shawn Pearce wrote:
>
> > I'm going to try to get tree deltas written to the pack sometime this
> > week. That should compact this intermediate pack down to something
> > that git-pack-objects would be able to successfully mmap into a
> > 32 bit address space. A complete repack with no delta reuse will
> > hopefully generate a pack closer to 400 MB in size. But I know
> > Jon would like to get that pack even smaller. :)
>
> One thing to consider in your code (if you didn't implement that
> already) is to _not_ attempt any delta on any object whose size is
> smaller than 50 bytes, and then limit the maximum delta size to
> object_size/2 - 20 (use that for the last argument to diff-delta() and
> store the undeltified object when diff-delta returns NULL). This way
> you'll avoid creating delta objects that are most likely to end up being
> _larger_ than the undeltified object.

So I added Nico's suggestions to fast-import and ran it on a small
subset of the Mozilla repository (3424 blobs):

    naive always delta:  6652 KiB
    Nico's suggestion:   6842 KiB

So Nico's suggestion of limiting delta size to (orig_len/2)-20 or
not using deltas on blobs < 50 bytes actually added 190 KiB to the
output pack.

Since this sample is probably fairly representative of the rest
of the repository's blobs, I'm thinking we may see a 2.8% increase
in size over the current 930 MB blob pack. That's another 26 MB
in our intermediate pack.

I don't think this suggestion is really worth including in
fast-import right now...

-- 
Shawn.
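
For anyone following along, here is a rough Python sketch of the rule
Nico describes and of the size projection above. It is not the
fast-import code: the function names are made up for illustration,
the delta size is passed in as a plain number instead of coming from
diff-delta(), and I am assuming a delta exactly at the limit is still
accepted.

```python
# Hypothetical sketch of the delta-size heuristic discussed above.
# Names are illustrative only; nothing here is git's real API.

MIN_DELTA_OBJECT_SIZE = 50  # bytes; smaller objects are never deltified


def max_delta_size(object_size):
    """Return the largest acceptable delta size in bytes,
    or None when the object is too small to deltify at all."""
    if object_size < MIN_DELTA_OBJECT_SIZE:
        return None
    return object_size // 2 - 20


def choose_representation(object_size, delta_size):
    """Return "delta" or "full" for one object.

    delta_size is the size of the candidate delta, or None when no
    delta could be produced (the diff-delta() returns NULL case).
    Assumption: a delta exactly at the limit is still accepted.
    """
    limit = max_delta_size(object_size)
    if limit is None or delta_size is None or delta_size > limit:
        return "full"
    return "delta"


def projected_growth(naive_kib, limited_kib, full_pack_mb):
    """Scale the growth measured on a sample up to the full pack.

    Returns (extra KiB in the sample, growth fraction,
    projected extra MB in the full pack).
    """
    extra_kib = limited_kib - naive_kib
    fraction = extra_kib / naive_kib
    return extra_kib, fraction, fraction * full_pack_mb
```

With the sample numbers, projected_growth(6652, 6842, 930) gives 190
KiB extra, a growth fraction of about 0.0286, and roughly 26.6 MB for
the full pack, which is in line with the estimate above.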