neubyr <neubyr@xxxxxxxxx> writes:
> On Fri, Sep 9, 2011 at 3:23 AM, Carlos Martín Nieto <cmn@xxxxxxxx> wrote:
> > On Thu, 2011-09-08 at 21:37 -0500, neubyr wrote:

>>> I have a test git repository with just two files in it. One of the
>>> file in it has a set of two lines that is repeated n times.
>>> e.g.:
>>> {{{
>>> $ for i in {1..5}; do cat ./lexico.txt>> lexico1.txt && cat
>>> ./lexico.txt>> lexico1.txt && mv ./lexico1.txt ./lexico.txt; done
>>> }}}
>>>
>>
>> So you've just created some data that can be compressed quite
>> efficiently.
>>
>>> I ran above command few times and performed commit after each run. Now
>>> disk usage of this repository directory is mentioned below. The 419M
>>> is working directory size and 2.7M is git repository/database size.
>>>
>>> {{{
>>> $ du -h -d 1 .
>>> 2.7M ./.git
>>> 419M .
>>>
>>> }}}

Have you tried the same but with

  $ git gc --prune=now

before running `du`?

>>> Is it because of the compression performed by git before storing data
>>> (or before sending commit)??
>>
>> Yes. Git stores its objects (the commit, the snapshot of the files,
>> etc.) compressed. When these objects are stored in a pack, the size can
>> be further reduced by storing some objects as deltas which describe the
>> difference between itself and some other object in the object-db.
>
> Does git store deltas for some files? I thought it uses snapshots
> (exact copy of staged files) only.

When creating packfile from loose objects (e.g. via `git gc`), it does
perform delta compression.

-- 
Jakub Narębski
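
To get a feel for why the object database stays so small compared to the
working directory, here is a rough sketch that needs no repository at all.
It builds a file by repeated doubling, much like the loop quoted above, and
compresses it with gzip. Assumptions: gzip is only a stand-in for the zlib
(DEFLATE) compression git applies to its objects, the two sample lines and
the count of 15 doublings are made up for illustration, and the result says
nothing about the extra savings from delta compression in a pack.

```shell
#!/bin/sh
# Sketch: how well does doubled, highly repetitive text compress?
# gzip stands in for git's zlib compression (an approximation, not git's format).
set -e

tmp=$(mktemp -d)    # scratch directory for the demo data

# Two sample lines (hypothetical content, 23 bytes in total).
printf 'first line\nsecond line\n' > "$tmp/lexico.txt"

# Double the file 15 times, similar in spirit to the loop in the original message.
i=0
while [ "$i" -lt 15 ]; do
    cat "$tmp/lexico.txt" "$tmp/lexico.txt" > "$tmp/lexico1.txt"
    mv "$tmp/lexico1.txt" "$tmp/lexico.txt"
    i=$((i + 1))
done

# Size before and after compression, in bytes.
raw=$(wc -c < "$tmp/lexico.txt" | tr -d ' ')
packed=$(gzip -c "$tmp/lexico.txt" | wc -c | tr -d ' ')

echo "raw bytes:        $raw"
echo "compressed bytes: $packed"

rm -rf "$tmp"
```

On a real repository, comparing the output of `git count-objects -v` before
and after `git gc` should show the loose and packed sizes separately, which
is closer to what `du` reports for .git.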