On Thu, May 13, 2010 at 10:57:22PM -0400, Ali Tofigh wrote:

> short version: will git handle large number of files efficiently if
> the history is simple and linear, i.e., without merges?

Short answer: large numbers of files, yes; large files, not really. The
shape of history is largely irrelevant.

Longer answer: git separates the conceptual structure of history (the
digraph of commits, and the pointers from commits to trees to blobs)
from the actual storage of the objects representing that history.
Problems with large files are usually storage issues: copying them
around in packfiles is expensive, storing an extra copy in the repo is
expensive, and trying deltas and diffs is expensive. None of those
things has to do with the shape of your history, so I would expect git
to handle such a load with a linear history about as well as a complex
history with merges.

For large numbers of files, git generally does a good job, especially
if those files are distributed throughout a directory hierarchy. But
keep in mind that the git repo will store another copy of every file.
The copies will be delta-compressed between versions and zlib-compressed
overall, but you may potentially double the amount of disk space
required if you have a lot of incompressible binary files.

For large files, git expects to be able to pull each file into memory
(sometimes two versions, if you are doing a diff), and it will copy
those files around when repacking (which you will want to do for the
sake of the smaller files). So files on the order of a few megabytes
are not a problem; if you have files in the hundreds of megabytes or
gigabytes, expect some operations, like repacking, to be slow.

Really, I would start by just "git add"-ing your whole filesystem,
doing a "git repack -ad", and seeing how long it takes and what the
resulting size is.

-Peff
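
To make that last suggestion concrete, here is a minimal sketch of such
a test as a shell session. The /path/to/data directory and the commit
message are placeholders; "git count-objects" and "du" are just two
convenient ways to check the resulting repository size:

    # turn the existing directory tree into a throwaway repository
    cd /path/to/data        # placeholder path
    git init

    # add and commit everything once; this writes loose objects for the
    # file contents, so it shows the baseline import cost
    git add .
    git commit -m "initial import"

    # repack all objects into a single packfile and time it
    time git repack -a -d

    # see how big the repository ended up
    git count-objects -v
    du -sh .git

("git repack -ad" in the message and "git repack -a -d" here are the
same invocation; the short options simply combine.)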