On Sat, Apr 26, 2014 at 9:50 AM, David Kastrup <dak@xxxxxxx> wrote: > Shawn Pearce <spearce@xxxxxxxxxxx> writes: > >> On Sat, Apr 26, 2014 at 12:48 AM, David Kastrup <dak@xxxxxxx> wrote: >>> Shawn Pearce <spearce@xxxxxxxxxxx> writes: >>>> >>>> And JGit was already usually slower than git-core. Now it will be >>>> even slower! :-) >>> >>> If your statement about JGit is accurate, it should likely have beat >>> Git for large use cases (where the performance improvements are most >>> important) as O(n) beats O(n^2) in the long run. >> >> Agreed. >> >> In a few cases yes, JGit did beat git-core at blame running time. >> Unfortunately according to my profiling blame performance is still >> dominated by inflation and scanning of commit and tree objects to >> identify unmodified blobs and advance to the next scapegoat ancestor. > > Oh, the C version is most certainly significantly impacted by that after > my patch. One can _significantly_ speed it up by increasing > core.deltaBaseCacheLimit from its rather silly value of 16M. If you > have a comparable control in JGit and if your benchmarking points to the > unpacking, that's where I'd suggest tweaking first. Good point, we have that control but I always forget to play with it during benchmarking. >>> Apart from the objective measurement of "total time", the more >>> subjective impression of interactive/incremental response (like in >>> git gui blame) where the order of results will significantly differ >>> (current git-blame --incremental focuses on getting blames resolved >>> in first-lines-first manner, the proposed git-blame rather works on a >>> newest-commits-first basis which might better match typical use >>> cases) might be worth reporting. >> >> Seeing this fill during execution was the initial motivation I had for >> writing git gui blame. > > It does not look impressively better to me, actually. Probably because > "git gui blame" is running "git blame" several times. IIRC git gui blame runs blame only twice. Once with the simple blame, and then again with -M -C or something like that. Its a nice display because you can see who moved code here, and then who originally wrote it. Unfortunately it has to be done as two passes as the blame implementation does not know how to compute both in one pass. The move/copy detection is also computationally more expensive per revision visited, so it really slows down the simple "who put it here" if the two passes were somehow run in a single iteration. >> I don't think anyone cares about the order it displays in. In fact >> ordering my timestamp may be more what the user wants anyway, as you >> suggest above. > > What the user wants anyway is that "git gui blame" notifies "git blame" > of the currently displayed window area whenever that changes, and that > git blame then _first_ deals with all chunks with a visible on-screen > part. That is a really good idea. Unfortunately finding that part of the window in the blame data can still take a while. You may compute other parts of the file anyway while looking for this part, as you had to compare two revisions and scapegoat sections to find out that commit didn't touch the interesting region and keep going back through history. You may get lucky and be able to fill in a recent region quickly, but I somehow suspect giving priority to the visible region in the priority queue won't really help to reduce latency of the visible region for the user. > The really tricky/expensive part was realizing that as opposed to target > line number ranges, source line number ranges may overlap and/or be > duplicate when using -M or -C options. That really messed things up for > a long time and was hard to debug. Once I figured out what was going > wrong, recoding the respective stuff was straightforward. Right, and JGit blame still is missing the -M and -C options, as I have not implemented those yet. I got basic blame and reverse blame working a few years ago and then stopped working on the code for a while. Now we have interest in improving the latency for $DAY_JOB, so I've been poking at the code again for the last week or so. But that -M and -C thing is still not implemented, and I know its going to be tricky to get right with the way the scapegoating is passed along. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html