On Tue, 11 Dec 2007, Daniel Berlin wrote:
>
> This seems to be a common problem with git. It seems to use a lot of
> memory to perform common operations on the gcc repository (even though
> it is faster in some cases than hg).

The thing is, git has a very different notion of "common operations"
than you do. To git, "git annotate" is just about the *last* thing you
ever want to do. It's not a common operation, it's a "last resort"
operation. In git, the whole workflow is designed for
"git log -p <pathnamepattern>" rather than annotate/blame.

In fact, we didn't support annotate at all for the first year or so of
git. The reason git is relatively slow at it is exactly that git doesn't
have "file history" at all, and only tracks full snapshots. So
"git blame" is really a very complex operation that basically looks at
the global history (because nothing else exists) and will basically
generate a totally different "view" of local history from that one.

The disadvantage is that it's much slower and much more costly than just
having a local history view to begin with. However, the absolutely
*huge* advantage is that it isn't then limited to local history.

So where git shines is when you actually use the global history, and do
merges or when you track more than one file (which others find hard, but
git finds much more natural).

An example of this is content that actually comes from multiple files.
File-based systems simply cannot do this at all. They aren't just
slower, they are totally unable to do it sanely. For git, it's all the
same: it never really cares about file boundaries in the first place.

The other example is doing things like "git log -p drivers/char", where
you don't ask for the log of a single file, but a general file pattern,
and get (still atomic!) commits as the result.

And perhaps the best example is just tracking code when you have two
files that merge into one (possibly because the "same" file was created
independently in two different branches).
git gets things like that right without even thinking about it. Others
tend to just flounder about and can't do anything at all about it.

That said, I'll see if I can speed up "git blame" on the gcc repository.
It _is_ a fundamentally much more expensive operation than it is for
systems that do single-file things.

		Linus
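
To make the cost argument concrete, here is a minimal sketch of what
deriving per-line attribution from nothing but whole-tree snapshots
looks like. It is an illustration only, not how git implements blame:
it assumes a strictly linear history, walks it oldest-first (git itself
works backwards from the newest commit and handles merges), and uses
Python's difflib to match lines. The names blame, history and the sample
commits c1..c3 are made up for the example.

```python
import difflib


def blame(history, path):
    """Attribute each line of `path` to the commit that introduced it.

    `history` is a list of (commit_id, snapshot) pairs, oldest first.
    Each snapshot is a full tree: a dict mapping path -> list of lines.
    There is no per-file history, so every snapshot must be examined.
    """
    prev_lines = []   # content of `path` in the previous snapshot
    prev_owner = []   # commit id credited with each of those lines
    for commit_id, snapshot in history:
        lines = snapshot.get(path, [])
        if lines == prev_lines:
            # The commit did not touch this file, but we only know
            # that after looking at its snapshot.
            continue
        # Start by crediting every line to this commit, then hand
        # unchanged runs back to whoever owned them before.
        owner = [commit_id] * len(lines)
        matcher = difflib.SequenceMatcher(
            a=prev_lines, b=lines, autojunk=False
        )
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                owner[j1:j2] = prev_owner[i1:i2]
        prev_lines, prev_owner = lines, owner
    return list(zip(prev_owner, prev_lines))


# Hypothetical three-commit history; c2 only adds an unrelated file.
history = [
    ("c1", {"a.c": ["int x;", "int y;"]}),
    ("c2", {"a.c": ["int x;", "int y;"], "b.c": ["void f(void);"]}),
    ("c3", {"a.c": ["int x;", "long z;", "int y;"],
            "b.c": ["void f(void);"]}),
]
result = blame(history, "a.c")
for commit_id, line in result:
    print(commit_id, line)
```

Even in this toy version the work scales with the number of commits in
the whole history rather than with the number of commits that touched
the one file, which is the trade-off described above.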