On Sun, Feb 14, 2010 at 10:28 PM, Johannes Schindelin <Johannes.Schindelin@xxxxxx> wrote:
>>
>> Concrete example: in one of my repositories, the average file size is
>> well over 2 gigabytes.
>
> Just to make extremely sure that you understand the issue: adding these
> files on a computer with 512 megabyte RAM works at the moment. Can you
> guarantee that there is no regression in that respect _with_ your patch?

It may not work without enough swap space, and it will not be pretty
anyway due to swapping. So, I see the following options:

1. Introduce a configuration parameter that defines whether or not to
use mmap() to hash files. It is a trivial change, but the real question
is what the default value for this option should be (should we use some
heuristic based on file size vs. available memory?).

2. Stream files in chunks. This is better because it is faster,
especially on large files, as you calculate SHA-1 and deflate the data
while it is still in the CPU cache. However, it may be more difficult
to implement, because we have filters that have to be applied to files
before they are put into the repository. (A rough sketch of the idea is
below, after my signature.)

3. Improve Git to support huge files on computers with low memory.

I think #3 is a noble goal, but I do not have time for that. I can try
to take on #2, but it may take more time than I have now. As to #1, I
am ready to send the patch if we agree that it is the right way to go...

I am open to any of your suggestions. Maybe there are other options
here that I have missed.

Dmitry
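
P.S. To make #2 a bit more concrete, here is a rough, untested sketch of
hashing a file in fixed-size chunks instead of mmap()ing the whole thing.
It uses OpenSSL's SHA1 functions and a made-up CHUNK_SIZE purely for
illustration; the real implementation would go through git's own SHA-1
wrappers, feed the same chunks to deflate, and still has to cope with the
filter machinery mentioned above.

/*
 * Rough sketch only: chunked hashing so memory use stays bounded
 * regardless of file size.  OpenSSL's SHA1_* API is used here just
 * for illustration.
 */
#include <stdio.h>
#include <stdlib.h>
#include <openssl/sha.h>

#define CHUNK_SIZE (8 * 1024 * 1024)	/* 8 MB; tunable, perhaps via config */

static int hash_file_chunked(const char *path,
			     unsigned char sha1[SHA_DIGEST_LENGTH])
{
	FILE *f = fopen(path, "rb");
	unsigned char *buf;
	size_t n;
	SHA_CTX ctx;

	if (!f)
		return -1;
	buf = malloc(CHUNK_SIZE);
	if (!buf) {
		fclose(f);
		return -1;
	}

	SHA1_Init(&ctx);
	while ((n = fread(buf, 1, CHUNK_SIZE, f)) > 0) {
		/*
		 * Note: git actually hashes "blob <size>\0" followed by the
		 * contents, not the raw file; that header is omitted here to
		 * keep the sketch short.  Deflating the same chunk would go
		 * in this loop as well, so the data is still in the CPU cache.
		 */
		SHA1_Update(&ctx, buf, n);
	}
	SHA1_Final(sha1, &ctx);

	free(buf);
	if (ferror(f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return 0;
}

The point of the sketch is only that peak memory stays at CHUNK_SIZE
instead of the full file size; how it fits with the convert/filter code
is exactly the hard part I mentioned under #2.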