On Tuesday, 2014-05-20, at 17:24 +0000, Stewart, Louis (IS) wrote:
> Thanks for the reply. I just read the intro to GIT and I am concerned
> about the part that it will copy the whole repository to the developers'
> work area. They really just need the one directory and the files under
> that one directory. The history has TBs of data.
>
> Lou
>
> -----Original Message-----
> From: Junio C Hamano [mailto:gitster@xxxxxxxxx]
> Sent: Tuesday, May 20, 2014 1:18 PM
> To: Stewart, Louis (IS)
> Cc: git@xxxxxxxxxxxxxxx
> Subject: EXT :Re: GIT and large files
>
> "Stewart, Louis (IS)" <louis.stewart@xxxxxxx> writes:
>
> > Can GIT handle versioning of large 20+ GB files in a directory?
>
> I think you can "git add" such files, push/fetch histories that
> contain such files over the wire, and "git checkout" such files, but
> naturally reading, processing and writing 20+ GB would take some time.
> In order to run operations that need to see the changes, e.g. "git log
> -p", a real content-level merge, etc., you would also need sufficient
> memory, because we do things in-core.

You can prevent a clone from fetching the whole history with the
--depth option of git clone.

The question is what you want to do with these 20 GB files. Just store
them in the repo and *very* occasionally change them? For that you need
a 64-bit build of git and enough RAM; 32 GB does the trick here.
Everything below was done with git 1.9.1.

Some tests on my machine with an ordinary hard disk give (sorry for
LC_ALL != C):

$ time git add file.dat; time git commit -m "add file"; time git status

real    16m17.913s
user    13m3.965s
sys     0m22.461s
[master 15fa953] add file
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 file.dat

real    15m36.666s
user    13m26.962s
sys     0m16.185s
# Auf Branch master
nichts zu committen, Arbeitsverzeichnis unverändert

real    11m58.936s
user    11m50.300s
sys     0m5.468s

$ ls -lh
-rw-r--r-- 1 thomas thomas 20G Mai 20 19:01 file.dat

So this works, but it isn't fast.
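To make the --depth point concrete, here is a small self-contained sketch (the repository, file names, and two-commit toy history are made up for illustration; with a real TB-sized history the saving is what matters):

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Build a small repository with two commits standing in for a huge history.
git init -q origin-repo
cd origin-repo
git config user.email you@example.com
git config user.name you
echo v1 > file.dat
git add file.dat && git commit -q -m "first"
echo v2 > file.dat
git commit -aq -m "second"
cd ..

# --depth 1 fetches only the tip commit, not the earlier history.
# A file:// URL is used because shallow clones need a real transport,
# not a plain local-path clone.
git clone -q --depth 1 "file://$tmp/origin-repo" shallow-repo
cd shallow-repo
git rev-list --count HEAD   # prints 1: only the most recent commit arrived
```

Note that a shallow clone still checks out the full working tree, so Lou's 20 GB files are still copied once; --depth only avoids replaying every historical version of them.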
Playing some tricks with --assume-unchanged helps here:

$ git update-index --assume-unchanged file.dat
$ time git status
# Auf Branch master
nichts zu committen, Arbeitsverzeichnis unverändert

real    0m0.003s
user    0m0.000s
sys     0m0.000s

This trick is only safe if you *know* that file.dat does not change.

And by the way, I also set

$ cat .gitattributes
*.dat -delta

since delta compression should be skipped for such files in any case.

Pushing and pulling these files to and from a server needs some tweaking
on the server side; otherwise the occasional git gc might kill the box.

Btw, I happily have files of 1.5 GB in my git repositories and change
them too, and I also work with Git for Windows. So in this region of
file sizes things work quite well.