On Wed, Oct 22, 2008 at 05:55:14PM -0400, Edward Ned Harvey wrote:

> I'm talking about 40-50,000 files, on multi-user production linux,
> which means the cache is never warm, except when I'm benchmarking.

Well, if you have a cold cache it's going to take longer. :) You should
probably benchmark if you want to know exactly how long.

> Specifically RHEL 4 with the files on NFS mount. Cold cache "svn st"
> takes ~10 mins. Warm cache 20-30 sec. Surprisingly to me,

Wow, that is awful. For comparison, "git status" from a cold cache on
the kernel repo takes me 17 seconds. From a warm cache, less than half a
second. Yes, the cold cache case would probably be better with inotify,
but compared to svn, that's screaming fast.

I haven't used perforce. If your bottleneck really is stat'ing the tree,
then yes, something that avoided that might perform better (but weigh
that particular optimization against other things which might be
slower).

> Out of curiosity, what are they talking about, when they say "git is
> fast?"

Well, there are the numbers above. When comparing to SVN or (god forbid)
CVS, there are order-of-magnitude speedups for most common operations.

> Just the fact that it's all local disk, or is there more to it
> than that? I could see - git would probably outperform perforce for

The things that generally make git fast are:

  - using a compact on-disk structure (including zlib and aggressive
    delta-finding) to keep your cache warm (and when it's not warm, to
    get data off the disk as quickly as possible)

  - the content-addressable nature of objects means we can just look at
    the data we need to solve a problem. For example, getting the
    history between point A and point B is "O(the number of commits
    between A and B)", _not_ "O(the size of the repo)". Viewing a log
    without generating diffs is "O(the number of commits)", not "O(some
    combination of the number of commits and the number of files in each
    commit)".
    Diffing two points in history is "O(the size of the differences
    between the two points)" and is totally independent of the number of
    commits between the two points.

  - most operations are streamable. "git log >/dev/null" on the kernel
    repo (about 90,000 commits) takes 8.5 seconds on my box. But it
    starts generating output immediately, so it _feels_ instant, and the
    rest of the data is generated while I read the first commit in my
    pager.

-Peff
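
P.S. To make the complexity point about content-addressable objects
concrete, here is a toy sketch in Python. It is not git's code or its
object format: the ObjectStore class, the "parent <id>" header, and the
helper names are all made up for illustration, and it assumes a strictly
linear history (real commits can have several parents). It only shows
that walking from B back to A reads the commits in between and nothing
else, and that the walk can hand back results as it goes.

```python
import hashlib


class ObjectStore:
    """Toy content-addressable store: each object is keyed by the SHA-1 of its bytes."""

    def __init__(self):
        self._objects = {}
        self.reads = 0  # counts lookups so we can see how much data a walk touches

    def put(self, content: bytes) -> str:
        key = hashlib.sha1(content).hexdigest()
        self._objects[key] = content
        return key

    def get(self, key: str) -> bytes:
        self.reads += 1
        return self._objects[key]


def make_commit(store, parent, message):
    # Hypothetical commit format: a "parent <id>" line, a blank line, the message.
    content = f"parent {parent or ''}\n\n{message}"
    return store.put(content.encode())


def parent_of(store, commit_id):
    first_line = store.get(commit_id).decode().split("\n", 1)[0]
    parent = first_line[len("parent "):]
    return parent or None


def history_between(store, a, b):
    """Yield commit ids from b back to (but not including) a, one at a time."""
    current = b
    while current is not None and current != a:
        yield current  # a generator, so the caller sees output immediately
        current = parent_of(store, current)


# Build a linear history of 1000 commits, then walk only the last 10.
store = ObjectStore()
ids = []
parent = None
for i in range(1000):
    parent = make_commit(store, parent, f"commit {i}")
    ids.append(parent)

between = list(history_between(store, ids[989], ids[999]))
print(len(between), store.reads)
```

With 1000 commits in the store, the walk yields 10 commits and does 10
object reads; the other 990 commits are never looked at.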