Re: [BUG-ish] diff compaction heuristic false positive

On 06/10/2016 09:50 AM, Jeff King wrote:
> I found a false positive with the new compaction heuristic in v2.9:
> [...]
> I get this rather unfortunate diff:
> 
>     $ git diff
>     diff --git a/file.rb b/file.rb
>     index bd9d1cb..67fbeba 100644
>     --- a/file.rb
>     +++ b/file.rb
>     @@ -1,5 +1,11 @@
>      def foo
>        do_foo_stuff()
>      
>     +  common_ending()
>     +end
>     +
>     +def bar
>     +  do_bar_stuff()
>     +
>        common_ending()
>      end

I've often thought that indentation would be a good, fairly universal
signal for diff to use when deciding how to slide hunks around. Most
source code is indented in a way that shows its structure.

I propose the following heuristic:

* Prefer to start and end hunks following lines with the least
  indentation.

* Define the "indentation" of a blank line to be the indentation of
  the previous non-blank line minus epsilon.

* In the case of a tie, prefer to slide the hunk down as far as
  possible.
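
To make the three rules concrete, here is a rough Python sketch. It is
only an illustration under my own assumptions, not anything taken from
git's xdiff code: the names indent_key, candidate_starts and best_start
are made up, only leading spaces are counted as indentation (tabs are
ignored), only the line before the hunk is scored (not the line at its
end), and "minus epsilon" is modeled by a tuple so that a blank line
sorts just below a non-blank line with the same indentation.

```python
def indent_key(lines, i):
    """Sort key for line i: (indentation, 0 if blank else 1)."""
    def width(s):
        # Assumption: indentation is the number of leading spaces only.
        return len(s) - len(s.lstrip(" "))

    if lines[i].strip():
        return (width(lines[i]), 1)
    # Blank line: indentation of the previous non-blank line "minus epsilon".
    for j in range(i - 1, -1, -1):
        if lines[j].strip():
            return (width(lines[j]), 0)
    return (0, 0)


def candidate_starts(lines, start, length):
    """All positions the added group lines[start:start+length] can slide to."""
    lo = start
    # Sliding up is possible while the line entering at the top equals
    # the line leaving at the bottom.
    while lo > 0 and lines[lo - 1] == lines[lo + length - 1]:
        lo -= 1
    hi = start
    # Sliding down is possible while the line leaving at the top equals
    # the line entering at the bottom.
    while hi + length < len(lines) and lines[hi] == lines[hi + length]:
        hi += 1
    return range(lo, hi + 1)


def best_start(lines, start, length):
    """Pick the start whose preceding line is least indented; ties slide down."""
    def key(s):
        # Assumption: a hunk at the very top of the file counts as unindented.
        before = indent_key(lines, s - 1) if s > 0 else (0, 0)
        return (before, -s)

    return min(candidate_starts(lines, start, length), key=key)


# The post-image of file.rb from the report; the diff shown by git adds
# the six lines starting at index 3.
EXAMPLE = [
    "def foo",
    "  do_foo_stuff()",
    "",
    "  common_ending()",
    "end",
    "",
    "def bar",
    "  do_bar_stuff()",
    "",
    "  common_ending()",
    "end",
]

chosen = best_start(EXAMPLE, 3, 6)
print(chosen, EXAMPLE[chosen:chosen + 6])
```

With this input the chosen start is the line after the first "end", so
the hunk begins with the blank line and "def bar" and ends with the
second "end".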

For the case above, the indentations for the candidate "before-the-hunk"
lines and the resulting hunk would be

>      def foo
> 2      do_foo_stuff()
> 2-ε
> 2      common_ending()
> 0    end
> 0-ε +
>     +def bar
>     +  do_bar_stuff()
>     +
>     +  common_ending()
>     +end

I haven't tested this heuristic systematically, but I have the feeling
that it would be pretty effective and yet quite easy to implement.

Michael
