Ivan Tolstosheyev <ivan.tolstosheyev@xxxxxxxxx> writes:

> #!/usr/bin/env bash
>
> git init test
> cd test
> for i in `seq 1 10000`
> do
> touch ${i} ; git add ${i} ; git commit -m "Add ${i}" ;
> done
> cd ..
> du -hs test
[...]
> 180 MB!!!?? and 7.4M after `git gc` - thanks to delta compression!

Most of those 180MB are wasted on the mostly unused (presumably 4KB)
blocks of your filesystem, since every loose object occupies at least
one block.  You should be looking at the post-gc'd numbers.

Let's see the breakdown of 'du -h .git':

     0  .git/rr-cache
  1.5M  .git/logs/refs/heads
  1.5M  .git/logs/refs
  2.9M  .git/logs
  4.0K  .git/objects/info
  2.8M  .git/objects/pack
  2.8M  .git/objects
     0  .git/branches
   12K  .git/info
     0  .git/remotes
   88K  .git/hooks
     0  .git/refs/tags
     0  .git/refs/heads
     0  .git/refs
  6.5M  .git

So 2.9MB is git keeping a reflog of everything we did (on HEAD and on
master).  Since merely storing a SHA1 for each of your 10000 operations
already takes 200K, that's not so far off -- the factor of 10 is in the
email, date and log message.

In my case 704K went into the index (not directly visible above, it's
the bulk of the top level).  That's also not unreasonable: merely
storing the object SHA1 (20 bytes) and a bunch of timestamps for 10000
files also gets you into the 500K ballpark.

The pack index amazingly takes only about 500K, even though it is
indexing 10000 trees and 10000 commits, so again the SHA1s alone get you
into the 400K ballpark.

That leaves only 2.3MB for the actual pack (which contains all the
data!).  But every commit must store a tree and a parent, so there are
at least 2*10000*20 = 400K incompressible bytes in the commits
already[*].

So we are within a factor of 6 of just the data required to save the
shape of your history DAG, no content included.  I'd say that's not too
bad.

[*] This is not quite true, the parents and trees might be pointers
within the pack.  AFAIK the proposed pack v4 format does this, and would
yield a more efficient compression.
So if you're going to waste energy worrying about this, you should help
with pack v4.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
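P.S. In case anyone wants to redo the back-of-the-envelope numbers above, here is a rough sketch in bash arithmetic. The variable names are made up for illustration. It assumes 20-byte binary SHA1s, takes "K" to mean 1000 bytes (as in 10000*20 = 200K), and plugs in the sizes from my 'du' output, so it only restates the estimates and measures nothing.

```shell
#!/usr/bin/env bash
# Lower-bound estimates for the 10000-commit test repository.
# All names are illustrative; the sizes are the ones quoted from 'du'.

n=10000     # commits made by the test script
sha1=20     # bytes in a binary SHA1 (assumption: stored uncompressed)

# One SHA1 per reflog entry.
reflog_min=$(( n * sha1 ))                 # 200000 bytes = 200K

# Every commit stores a tree and a parent.
dag_min=$(( 2 * n * sha1 ))                # 400000 bytes = 400K

# The pack index covers 10000 trees and 10000 commits.
idx_min=$(( 2 * n * sha1 ))                # 400000 bytes = 400K

# Pack data = .git/objects/pack (2.8M) minus the pack index (about 500K).
pack_kb=$(( 2800 - 500 ))                  # 2300K

# Pack data relative to the DAG lower bound, in hundredths.
ratio_x100=$(( pack_kb * 1000 * 100 / dag_min ))   # 575, i.e. 5.75x

echo "reflog lower bound:     ${reflog_min} bytes"
echo "commit DAG lower bound: ${dag_min} bytes"
echo "pack index lower bound: ${idx_min} bytes"
echo "pack data:              ${pack_kb}K"
echo "pack / DAG bound:       $(( ratio_x100 / 100 )).$(( ratio_x100 % 100 ))"
```

The last line prints 5.75, which is where "within a factor of 6" comes from.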