On Fri, 2011-09-09 at 09:04 -0500, neubyr wrote:
> On Fri, Sep 9, 2011 at 3:23 AM, Carlos Martín Nieto <cmn@xxxxxxxx> wrote:
> > On Thu, 2011-09-08 at 21:37 -0500, neubyr wrote:
> >> I have a test git repository with just two files in it. One of the
> >> files in it has a set of two lines that is repeated n times.
> >> e.g.:
> >> {{{
> >> $ for i in {1..5}; do cat ./lexico.txt >> lexico1.txt && cat ./lexico.txt >> lexico1.txt && mv ./lexico1.txt ./lexico.txt; done
> >> }}}
> >>
> >
> > So you've just created some data that can be compressed quite
> > efficiently.
> >
> >> I ran the above command a few times and performed a commit after each
> >> run. The disk usage of this repository directory is shown below. The
> >> 419M is the working directory size and 2.7M is the git
> >> repository/database size.
> >>
> >> {{{
> >> $ du -h -d 1 .
> >> 2.7M ./.git
> >> 419M .
> >> }}}
> >>
> >> Is it because of the compression performed by git before storing data
> >> (or before sending commit)??
> >>
> >
> > Yes. Git stores its objects (the commit, the snapshot of the files,
> > etc.) compressed. When these objects are stored in a pack, the size can
> > be further reduced by storing some objects as deltas, each of which
> > describes the difference between that object and some other object in
> > the object-db.
> >
>
> Does git store deltas for some files? I thought it uses snapshots
> (exact copy of staged files) only.

Yes and no. The data model for git is to always store snapshots, and it
always expects to have the full files available. In a packfile, however,
in order to save space, some objects are stored as deltas to other
objects in the same file.

http://progit.org/book/ch9-4.html

> >
> >> Following were results with subversion:
> >>
> >> Subversion client (redundant(?) copy exists in .svn/text-base/
> >> directory, hence double size in client):
> >> {{{
> >> $ du -h -d 1
> >> 416M ./.svn
> >> 832M .
> >> }}}
> >
> > Subversion stores the "pristines" (which is the status of the files in
> > the latest revision) inside the .svn directory. I wouldn't call this
> > copy redundant, though, as it allows you to run diff locally. The
> > pristines are stored uncompressed, which is why half of the space is
> > taken up by the .svn directory.
> >
> >>
> >> Subversion repo/server:
> >> {{{
> >> $ du -h -d 1
> >> 12K ./conf
> >> 1.2M ./db
> >> 36K ./hooks
> >> 8.0K ./locks
> >> 1.2M .
> >> }}}
> >
> > I don't know how the repository is stored in Subversion, but it may also
> > be compressed. You may be able to reduce your git repository size by
> > (re)generating packs with 'git repack' and doing some cleanups with 'git
> > gc', but the repository size is not often a concern.
> >
> > cmn
> >
>
> that's helpful. thanks.
>
> --
> neuby.r
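
To make the compression point above concrete, here is a minimal sketch that rebuilds a file by repeated doubling, as in the original loop, and compares its raw size with its deflate-compressed size. It is only an approximation of what git does: gzip stands in for the zlib compression git applies to its objects, no packfile deltas are involved, and the two sample lines, the temporary directory, and the variable names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: gzip (deflate) is used as a stand-in for git's zlib compression.
# The file contents and variable names below are hypothetical.

workdir=$(mktemp -d)
file="$workdir/lexico.txt"

# Start with a two-line file (23 bytes).
printf 'first line\nsecond line\n' > "$file"

# Double the file ten times, mirroring the cat/cat/mv loop from the thread.
i=0
while [ "$i" -lt 10 ]; do
    cat "$file" "$file" > "$workdir/lexico1.txt"
    mv "$workdir/lexico1.txt" "$file"
    i=$((i + 1))
done

# The arithmetic expansion strips any padding that wc adds on some systems.
raw_size=$(( $(wc -c < "$file") ))
packed_size=$(( $(gzip -c "$file" | wc -c) ))

echo "raw: $raw_size bytes, compressed: $packed_size bytes"

rm -rf "$workdir"
```

The raw file is 23 * 1024 = 23552 bytes, while the compressed stream should come out far smaller, which is the same effect that keeps .git small next to a large working directory. In a real repository, 'git count-objects -v' reports the loose and packed object sizes, so it can be run before and after 'git gc' to see what repacking saves.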