On Wed, 10 Dec 2008, Jonathan Blanton wrote:
>
> I'm using Git for a project that contains huge (multi-gigabyte) files.
> I need to track these files, but with some of the really big ones,
> git-add aborts with the message "fatal: Out of memory, malloc failed".

git is _really_ not designed for huge files. By design - good or bad -
git does pretty much all single-file operations with the whole file in
memory as one single allocation.

Now, some of that is hard to fix - or at least would generate much more
complex code. The _particular_ case of "git add" could be fixed without
undue pain, but it's not entirely trivial either.

The main offender is probably "index_fd()", which just mmap's the whole
file in one go and then calls write_sha1_file(), which really expects it
to be one single memory area, both for the initial SHA1 creation and for
the compression and writing out of the result. Changing that to do big
files in pieces would not be _too_ painful, but it's not just a couple
of lines either.

However, git performance with big files would never be wonderful, and
things like "git diff" would still end up reading not just the whole
file, but _both_versions_ at the same time. Marking the big files as
no-diff might help, though.

		Linus
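
To make the "in pieces" part concrete, here is a minimal, self-contained
sketch of hashing a git-style blob ("blob <size>\0" header followed by
the file data) in fixed-size chunks, using OpenSSL's SHA-1. This is not
git's actual index_fd()/write_sha1_file() code; the helper name
hash_blob_in_pieces and the 8 MiB chunk size are invented for
illustration, and a real conversion would also have to stream the zlib
compression and the object write the same way.

/*
 * A minimal standalone sketch (not git's code): hash a blob in
 * fixed-size pieces instead of one big mmap.  The helper name and the
 * 8 MiB chunk size are arbitrary; a real fix inside index_fd() /
 * write_sha1_file() would also need to stream the zlib compression.
 *
 * Build: cc sketch.c -lcrypto
 */
#include <openssl/sha.h>
#include <stdio.h>
#include <sys/stat.h>

#define CHUNK (8 * 1024 * 1024)		/* read the file 8 MiB at a time */

static unsigned char buf[CHUNK];

static int hash_blob_in_pieces(FILE *in, unsigned long long size,
			       unsigned char sha1[20])
{
	SHA_CTX ctx;
	char hdr[64];
	int hdrlen;
	size_t n;

	/* A loose blob is hashed as "blob <size>\0" followed by the data. */
	hdrlen = snprintf(hdr, sizeof(hdr), "blob %llu", size) + 1;

	SHA1_Init(&ctx);
	SHA1_Update(&ctx, hdr, hdrlen);

	/* Feed the contents to the hash one piece at a time, so peak
	 * memory use stays at one chunk rather than the whole file. */
	while ((n = fread(buf, 1, sizeof(buf), in)) > 0)
		SHA1_Update(&ctx, buf, n);

	SHA1_Final(sha1, &ctx);
	return ferror(in) ? -1 : 0;
}

int main(int argc, char **argv)
{
	struct stat st;
	unsigned char sha1[20];
	FILE *f;
	int i;

	if (argc != 2 || stat(argv[1], &st) < 0 || !(f = fopen(argv[1], "rb")))
		return 1;
	if (hash_blob_in_pieces(f, (unsigned long long)st.st_size, sha1) < 0)
		return 1;
	for (i = 0; i < 20; i++)
		printf("%02x", sha1[i]);
	printf("\n");
	fclose(f);
	return 0;
}

Because it uses the same "blob <size>\0" convention, this should print
the same object name that "git hash-object <file>" would. As for the
no-diff marking, that is just an attributes entry such as "*.iso -diff"
in .gitattributes, which makes git treat matching files as binary for
diff purposes instead of loading both versions to produce a textual
diff.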