On 12/11/07, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
>
> On Tue, 11 Dec 2007, Daniel Berlin wrote:
> >
> > This seems to be a common problem with git. It seems to use a lot of
> > memory to perform common operations on the gcc repository (even though
> > it is faster in some cases than hg).
>
> The thing is, git has a very different notion of "common operations" than
> you do.
>
> To git, "git annotate" is just about the *last* thing you ever want to do.
> It's not a common operation, it's a "last resort" operation. In git, the
> whole workflow is designed for "git log -p <pathnamepattern>" rather than
> annotate/blame.
>
I understand this, and completely agree with you. However, I cannot force
the GCC people to adopt a completely new workflow in this regard. The
changelogs are not useful enough (and we've had huge fights over this) to
do "git log -p" and figure out the information we want, and looking
through thousands of diffs to find the one that happened to touch your
line is also pretty annoying. As a result, annotate is a major operation
for GCC developers. I wish I could fix this silliness, but I can't :)

> That said, I'll see if I can speed up "git blame" on the gcc repository.
> It _is_ a fundamentally much more expensive operation than it is for
> systems that do single-file things.

SVN had the same problem (file retrieval was the most expensive operation
on FSFS). One of the things I did to speed it up tremendously was to
perform the annotate from newest to oldest (i.e., in reverse), and stop
annotating once we had come up with annotate info for all the lines. If
you can't speed up file retrieval itself, you can make it need fewer
files :)

In GCC history, it is likely you could cut off at least 30% of the time
this way, because files have often changed entirely multiple times.

-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
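
[The reverse-annotate idea described above — walk history newest-to-oldest, attribute a line to the first (most recent) revision in which it differs from the prior version, and bail out as soon as every line is attributed — can be sketched roughly as follows. This is a simplified illustration, not SVN's or git's actual implementation: it assumes history is available as a list of `(revision_id, lines)` snapshots, newest first, and uses plain `difflib` matching in place of a real diff engine.]

```python
import difflib

def reverse_blame(history):
    """Annotate the newest version of a file by walking history in reverse.

    history: list of (revision_id, lines) pairs, newest first (hypothetical
    input format for this sketch). Returns one revision id per line of the
    newest version, naming the revision that last changed that line.
    """
    newest_id, newest_lines = history[0]
    blame = [None] * len(newest_lines)       # per-line attribution, filled in as we go
    # origin[i] = position of newest line i in the revision currently being
    # examined, or None once the line has been attributed.
    origin = list(range(len(newest_lines)))
    remaining = len(newest_lines)

    cur_id, cur_lines = newest_id, newest_lines
    for older_id, older_lines in history[1:]:
        if remaining == 0:
            break                            # early cutoff: every line attributed
        matcher = difflib.SequenceMatcher(None, cur_lines, older_lines)
        # Positions in cur_lines that survive unchanged into older_lines.
        carried = {}
        for a, b, size in matcher.get_matching_blocks():
            for k in range(size):
                carried[a + k] = b + k
        new_origin = [None] * len(blame)
        for i, pos in enumerate(origin):
            if pos is None:
                continue                     # already attributed
            if pos in carried:
                new_origin[i] = carried[pos] # line existed earlier; keep walking back
            else:
                blame[i] = cur_id            # line was introduced in cur_id
                remaining -= 1
        origin = new_origin
        cur_id, cur_lines = older_id, older_lines

    # Anything that survived all the way back belongs to the oldest revision seen.
    for i, pos in enumerate(origin):
        if pos is not None:
            blame[i] = cur_id
    return blame
```

[The `remaining == 0` check is the whole point: when recent revisions rewrote the file heavily — as happens often in GCC's history — the walk terminates after a handful of versions instead of retrieving every file version back to the initial commit.]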