Re: [PATCH 3/4] diff-highlight: match up lines before highlighting

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Nov 03, 2015 at 04:44:16PM -0500, Jeff King wrote:

> Have you looked at a diff of the old/new output on something like
> git.git?

This should be pretty easy to do (and time). I tried:

  git log --oneline --color -p >base
  time perl highlight.old <base >old
  time perl highlight.new <base >new
  diff -c old new | less

where the "highlight.*" scripts are the versions at master, and master
with your series applied.

Your is _much_ slower. I get:

  real    0m25.538s
  user    0m25.420s
  sys     0m0.120s

for the old versus:

  real    2m3.580s
  user    2m3.548s
  sys     0m0.156s

for your series. In an interactive setting, the latency may not be that
noticeable, but if you are digging far into history (e.g., "git log -p",
then using "/" in less to search for a commit or some test), I suspect
it would be very noticeable.

I was thinking there was some low-hanging fruit in memoizing the
calculations, but even the prefix/suffix computation is pairwise. I'm
not really sure how to make this much faster.

As for the output itself, the diff between the two looks promising. The
first several cases I looked at ar strict improvements. Some of them are
kind of weird, especially in English text. For example, see the RelNotes
update in 2635c2b. The base diff is:

    * "git rebase -i" had a minor regression recently, which stopped
    considering a line that begins with an indented '#' in its insn
-   sheet not a comment, which is now fixed.
-   (merge 1db168e gr/rebase-i-drop-warn later to maint).
+   sheet not a comment. Further, the code was still too picky on
+   Windows where CRLF left by the editor is turned into a trailing CR
+   on the line read via the "read" built-in command of bash.  Both of
+   these issues are now fixed.
+   (merge 39743cf gr/rebase-i-drop-warn later to maint).

Before we highlighted nothing, and now we hone in on "now fixed". Which
is _sort of_ a match, but really the whole sentence structure has
changed. If this is the worst regression, I can certainly live with it.
And even a proper LCS diff would probably end up making spaghetti of a
text change like this.

In the other thread I mentioned earlier, the solution I cooked up was
dropping highlighting entirely for hunks over a certain percentage of
highlighting. I wonder if we could do something similar here (e.g.,
don't match lines where more than 50% of the line would be highlighted).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]