On Tue, Oct 27, 2015 at 02:04:23AM +0000, Sivakumar Selvam wrote:

> When I finished git repacking, I found 12 pack files with each 4 GB and
> the total size is 48 GB. Again I ran the same git repack command by just
> removing only --max-pack-size= parameter, the size of the single pack file
> is 66 GB.
>
> git repack -A -b -d -q --depth=50 --window=10 abc.git
>
> Now, I see the total size of the single abc.git has become 66 GB. Initially
> it was 34 GB, After using --max-pack-size=4g it become 48 GB. When we
> remove the --max-pack-size=4g parameter and tried to create a single pack
> file now it become 66 GB.
>
> Looks like once we do git repack with multiple pack files, we can't revert
> back to the original size.

Git tries to take some shortcuts when repacking: if two objects are in
the same pack but are not stored as deltas, it will not consider making
deltas out of them. The logic is that we would already have tried that
while making the original pack. But of course when you are doing weird
things with the packing parameters, that is not always a good
assumption.

When doing experiments like this, add "-f" to your repack command-line
to avoid reusing deltas. The result should be much smaller (at the
expense of more CPU time to do the repack).

I'd also recommend increasing "--window" if you can afford the extra CPU
during the repack. It can often produce smaller packs. And it has less
cost than you might think (e.g., window=20 is not twice as expensive as
window=10, because the work to access the objects is cached).

You can also increase --depth, but I have never found it to be
particularly helpful for decreasing size[1].

-Peff

[1] This is all theory, and I don't know how well git actually finds such
deltas, but it is probably better to have a dense tree of deltas rather
than long chains. If you have a chain of N objects and would like to add
object N+1 to it, you are probably not much worse off to base it on
object N-1, creating a "fork" at N.
The resulting objects should be less expensive to access for subsequent
operations (as any time you want the Nth object, you have to resolve all
parts of the chain, so shorter chains are better, and the delta cache is
more likely to get a hit on that N-1 object).
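
To make the chain-versus-fork argument in [1] concrete, here is a toy model in Python. It is only a sketch: the mapping `bases` and the helpers `chain_depth` and `total_cost` are hypothetical names made up for illustration, the model ignores the delta cache, and none of it reflects how git actually stores or selects deltas. It just counts how many deltas must be applied to rebuild an object.

```python
# Toy model of delta chains, NOT git's pack format.
# `bases` maps each object number to the object it is a delta against,
# or to None if the object is stored whole.

def chain_depth(bases, obj):
    """Count the deltas that must be applied to reconstruct `obj`."""
    depth = 0
    while bases[obj] is not None:
        obj = bases[obj]
        depth += 1
    return depth

def total_cost(bases):
    """Sum of delta applications to rebuild every object once (no cache)."""
    return sum(chain_depth(bases, obj) for obj in bases)

# Chain of N=4 objects, with object 5 appended to the end of the chain.
linear = {1: None, 2: 1, 3: 2, 4: 3, 5: 4}

# Same chain, but object 5 is based on object N-1 (3), forking next to 4.
forked = {1: None, 2: 1, 3: 2, 4: 3, 5: 3}

print(chain_depth(linear, 5))  # 4
print(chain_depth(forked, 5))  # 3
print(total_cost(linear))      # 10
print(total_cost(forked))      # 9
```

In this model the forked layout makes the new object one step cheaper to reach, and objects 4 and 5 share object 3 as a base, which is the case where a cached copy of that base would help.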