Re: git blame performance

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 11/06/2015 02:37 PM, Jan Smets wrote:
> I have recently migrated a fairly large project from CVS to Git.
> One of the issues we're having is the blame/annotate performance.
> [...]
> cvs annotate of the same file (over the network) is ready in 0.8 seconds.
> blame/annotate is a frequently used operation, ranging between 5 to 20
> usages a day per developer.

cvs annotate and git blame both have to follow history back until they
find the commit that introduced the oldest line that is still in the
current version of the file. So for a really old file, a lot of history
has to be walked through.

The reason that cvs annotate is so much faster than git blame is that
CVS stores revisions filewise, with all of the modifications to file
$FILE being stored in a single $FILE,v file. So in the worst case, CVS
only has to read this one file.

Git, on the other hand, stores revisions treewise. It has no way of
knowing, ab initio, which revisions touched a given file. (In fact, this
concept is not even well-defined because the answer depends on things
like whether copy (-C) and move (-M) detection are turned on and what
parameters they were given.) This means that git blame has to traverse
most of history to find the commits that touched $FILE.

Slow git blame is thus a relatively unavoidable consequence of Git's
data model. That's not to say that it can't be sped up somewhat, but it
will never reach CVS speeds.

But it does have some features that can reduce the work:

-L <start>,<end>, -L :<funcname> -- Annotate only the given line range.
This option can speed things up (1) if the range of lines does not
include the oldest lines, (2) by limiting which parents of merge commits
have to be followed.

--incremental -- if you are using this command to build tooling, this
option allows partial results to be returned early, to reduce the wait
until the user sees something.

If you are not interested in changes older than a certain date or
revision, you can limit the amount of history that git blame traverses.
See SPECIFYING RANGES in the manpage.

Michael

-- 
Michael Haggerty
mhagger@xxxxxxxxxxxx

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]