On Wed, Mar 30, 2016 at 12:31:30PM -0700, Jacob Keller wrote: > So far I've only found a single location that ends up looking worse > within the Linux kernel. Diffs of some Kbuild settings result in > something like > > before: > > If unsure, say Y. > + > +config RMI4_I2C > + tristate "RMI4 I2C Support" > + depends on RMI4_CORE && I2C > + help > + Say Y here if you want to support RMI4 devices connected to an I2C > + bus. > + > + If unsure, say Y. > > after: > > required for all RMI4 device support. > > + If unsure, say Y. > + > +config RMI4_I2C > + tristate "RMI4 I2C Support" > + depends on RMI4_CORE && I2C > + help > + Say Y here if you want to support RMI4 devices connected to an I2C > + bus. > + > If unsure, say Y. > > So in this particular instance which has multiple blank lines and is a > similar issue as with Stefan's note above, this is where the heuristic > falls apart. At least for C code this is basically vanishingly small > compared to the number of comment header fix ups. > > I think it may be that Stefan's suggestions above may be on the right > track to resolve that too. This is a tricky one. There _aren't_ actually multiple blank lines in the ambiguous area, because this particular example comes at the very end of the file. Try: git show 8d99758dee3 drivers/input/rmi4/Kconfig which adds a block in the middle of the file. It looks good both before and after running through the script. Now look at this example: git show fdf51604f10 drivers/input/rmi4/Kconfig which looks like: diff --git a/drivers/input/rmi4/Kconfig b/drivers/input/rmi4/Kconfig index 5ea60e3..cc3f7c5 100644 --- a/drivers/input/rmi4/Kconfig +++ b/drivers/input/rmi4/Kconfig @@ -8,3 +8,12 @@ config RMI4_CORE required for all RMI4 device support. If unsure, say Y. + +config RMI4_I2C + tristate "RMI4 I2C Support" + depends on RMI4_CORE && I2C + help + Say Y here if you want to support RMI4 devices connected to an I2C + bus. + + If unsure, say Y. Note that there is no trailing context, as we're adding at the end of the file. So the ambiguous portion consists of only two lines: an empty line, and "If unsure...". And we bump the latter to the top, per the heuristic (it's the exact opposite of every other case, where the blank line is a true delimiter). As a human, I think the indentation here is the real syntactic clue. But getting into indentation heuristics is probably insane. The script could probably make this work by disabling itself if the hunk is at the end of the diffed file (i.e., we don't see more context lines afterward). That covers any case like this where newline _is_ a delimiter, but we just have some internal newlines, too. It wouldn't cover the case where we had internal newlines but used some other paragraph delimiter, but based on the results so far, that seems rather rare. Something like this: --- foo.pl.orig 2016-03-31 09:44:44.281232230 -0400 +++ foo.pl 2016-03-31 09:44:34.901232632 -0400 @@ -24,13 +24,15 @@ push @hunk, $_; $state = STATE_IN_CHUNK; } else { - flush_hunk(); + print @hunk; + @hunk = (); $state = STATE_NONE; print; } } } -flush_hunk(); +print @hunk; +@hunk = (); sub flush_hunk { my $context_len = 0; -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html