Linus Torvalds <torvalds@xxxxxxxx> wrote:
>
>
> On Thu, 20 Apr 2006, Shawn Pearce wrote:
> >
> > So with 1.3.0.g56c1 "git repack -a -d -f" did worse:
> >
> >   Total 46391, written 46391 (delta 6649), reused 39742 (delta 0)
> >   129M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack
> >
> > I just tried -f on v1.2.3 and it did slightly better then before:
> >
> >   Total 46391, written 46391 (delta 6847), reused 38012 (delta 0)
> >   59M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack

Oddly enough, repacking the v1.2.3 pack using 1.3.0.g56c1 created
an even smaller pack ("git-repack -a -d"):

  Total 46391, written 46391 (delta 8253), reused 44985 (delta 6847)
  49M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack

and repacking again with "git-repack -a -d" chopped off another 1M:

  Total 46391, written 46391 (delta 8258), reused 46386 (delta 8253)
  48M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pac

but then adding -f definitely gives us the 2x explosion again:

  Total 46391, written 46391 (delta 6649), reused 37894 (delta 0)
  129M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack

> Interesting. The bigger packs do generate fewer deltas, but they don't
> seem to be _that_ much fewer. And the deltas themselves certainly
> shouldn't be bigger.
>
> It almost sounds like there's a problem with choosing what to delta
> against, not necessarily a delta algorithm problem. Although that sounds a
> bit strange, because I wouldn't have thought we actually changed the
> packing algorithm noticeably since 1.2.3.
>
> Hmm. Doing "gitk v1.2.3.. -- pack-objects.c" shows that I was wrong. Junio
> did the "hash basename and direname a bit differently" thing, which would
> appear to change the "find objects to delta against" a lot. That could be
> it.
>
> You could try to revert that change:
>
>     git revert eeef7135fed9b8784627c4c96e125241c06c65e1
>
> which needs a trivial manual fixup (remove the conflict entirely:
> everything between the "<<<<" and ">>>>>" lines should go), and see if
> that's it.

Whoa.  I did that revert and fixup on top of 'next'.  The pack from
"git-repack -a -d -f" is now even larger, and it contains even fewer
deltas:

  Total 46391, written 46391 (delta 5148), reused 39565 (delta 0)
  171M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack

> You can also try to see if
>
>     git repack -a -d -f --window=50
>
> makes for a better pack (at the cost of a much slower repack). It makes
> git try more objects to delta against, and can thus hide a bad sort order.

With --window=50 on 'next' (without the revert):

  Total 46391, written 46391 (delta 6666), reused 39723 (delta 0)
  129M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack

For good measure I tried --window=100 and --window=500 with pretty much
the same result (a slightly higher delta count but still a 129M pack).

-- 
Shawn.
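
To line up the runs reported above, here is a minimal Python sketch that parses the repack summary lines and prints the share of written objects stored as deltas next to each pack size. The summary lines and sizes are copied from this thread; the labels, the function names (`parse_summary`, `delta_percent`) and the regular expression are my own illustration, not anything git provides. On these numbers the delta share moves only a little between the 59M and 129M packs, which fits the suspicion that the problem lies in what gets chosen as a delta base rather than in how many deltas are made.

```python
import re

# Summary lines and pack sizes (in MB) copied from the runs in this thread.
# The labels are shorthand invented for this sketch, not git output.
RUNS = [
    ("1.3.0.g56c1 -f",
     "Total 46391, written 46391 (delta 6649), reused 39742 (delta 0)", 129),
    ("v1.2.3 -f",
     "Total 46391, written 46391 (delta 6847), reused 38012 (delta 0)", 59),
    ("1.3.0.g56c1 on v1.2.3 pack",
     "Total 46391, written 46391 (delta 8253), reused 44985 (delta 6847)", 49),
    ("next + revert, -f",
     "Total 46391, written 46391 (delta 5148), reused 39565 (delta 0)", 171),
    ("next, -f --window=50",
     "Total 46391, written 46391 (delta 6666), reused 39723 (delta 0)", 129),
]

# Assumed shape of the summary line, based only on the samples above.
SUMMARY = re.compile(
    r"Total (\d+), written (\d+) \(delta (\d+)\), "
    r"reused (\d+) \(delta (\d+)\)"
)


def parse_summary(line):
    """Split one repack summary line into its five integer fields."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError("not a repack summary line: %r" % line)
    total, written, delta, reused, reused_delta = map(int, match.groups())
    return {
        "total": total,
        "written": written,
        "delta": delta,
        "reused": reused,
        "reused_delta": reused_delta,
    }


def delta_percent(line):
    """Percentage of written objects that were stored as deltas."""
    fields = parse_summary(line)
    return 100.0 * fields["delta"] / fields["written"]


if __name__ == "__main__":
    for label, line, size_mb in RUNS:
        print("%-28s %5.1f%% deltas  %4dM" % (label, delta_percent(line), size_mb))
```

The pack size is kept as a separate field because it comes from the directory listing, not from the summary line itself.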