Re: weird diff output?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Mar 30, 2016 at 12:31:30PM -0700, Jacob Keller wrote:

> So far I've only found a single location that ends up looking worse
> within the Linux kernel. Diffs of some Kbuild settings result in
> something like
> 
> before:
> 
>           If unsure, say Y.
> +
> +config RMI4_I2C
> +       tristate "RMI4 I2C Support"
> +       depends on RMI4_CORE && I2C
> +       help
> +         Say Y here if you want to support RMI4 devices connected to an I2C
> +         bus.
> +
> +         If unsure, say Y.
> 
> after:
> 
>           required for all RMI4 device support.
> 
> +         If unsure, say Y.
> +
> +config RMI4_I2C
> +       tristate "RMI4 I2C Support"
> +       depends on RMI4_CORE && I2C
> +       help
> +         Say Y here if you want to support RMI4 devices connected to an I2C
> +         bus.
> +
>           If unsure, say Y.
> 
> So in this particular instance which has multiple blank lines and is a
> similar issue as with Stefan's note above, this is where the heuristic
> falls apart. At least for C code this is basically vanishingly small
> compared to the number of comment header fix ups.
> 
> I think it may be that Stefan's suggestions above may be on the right
> track to resolve that too.

This is a tricky one. There _aren't_ actually multiple blank lines in
the ambiguous area, because this particular example comes at the very
end of the file. Try:

  git show 8d99758dee3 drivers/input/rmi4/Kconfig

which adds a block in the middle of the file. It looks good both before
and after running through the script. Now look at this example:

  git show fdf51604f10 drivers/input/rmi4/Kconfig

which looks like:

diff --git a/drivers/input/rmi4/Kconfig b/drivers/input/rmi4/Kconfig
index 5ea60e3..cc3f7c5 100644
--- a/drivers/input/rmi4/Kconfig
+++ b/drivers/input/rmi4/Kconfig
@@ -8,3 +8,12 @@ config RMI4_CORE
          required for all RMI4 device support.
 
          If unsure, say Y.
+
+config RMI4_I2C
+       tristate "RMI4 I2C Support"
+       depends on RMI4_CORE && I2C
+       help
+         Say Y here if you want to support RMI4 devices connected to an I2C
+         bus.
+
+         If unsure, say Y.


Note that there is no trailing context, as we're adding at the end of
the file. So the ambiguous portion consists of only two lines: an empty
line, and "If unsure...". And we bump the latter to the top, per the
heuristic (it's the exact opposite of every other case, where the blank
line is a true delimiter).

As a human, I think the indentation here is the real syntactic clue. But
getting into indentation heuristics is probably insane.

The script could probably make this work by disabling itself if the hunk
is at the end of the diffed file (i.e., we don't see more context lines
afterward). That covers any case like this where newline _is_ a
delimiter, but we just have some internal newlines, too. It wouldn't
cover the case where we had internal newlines but used some other
paragraph delimiter, but based on the results so far, that seems rather
rare.

Something like this:

--- foo.pl.orig	2016-03-31 09:44:44.281232230 -0400
+++ foo.pl	2016-03-31 09:44:34.901232632 -0400
@@ -24,13 +24,15 @@
       push @hunk, $_;
       $state = STATE_IN_CHUNK;
     } else {
-      flush_hunk();
+      print @hunk;
+      @hunk = ();
       $state = STATE_NONE;
       print;
     }
   }
 }
-flush_hunk();
+print @hunk;
+@hunk = ();
 
 sub flush_hunk {
   my $context_len = 0;

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]