On Tue, Apr 19, 2016 at 12:00 AM, Jeff King <peff@xxxxxxxx> wrote:
> On Mon, Apr 18, 2016 at 11:47:52PM -0700, Stefan Beller wrote:
>
>> I am convinced the better way to do it is like this:
>>
>> Calculate the entropy for each line and take the last line with the
>> lowest entropy as the last line of the hunk.
>
> I'll be curious to see the results, but I think sometimes predictable
> and stupid may be the best route with these sorts of things. In
> particular, I'd worry that a content-independent measure of entropy
> might miss some subtleties of a particular language (e.g., that "*" is
> more or less meaningful than some other character). But we'll see. :)

I would assume that the "*" would have little entropy when there are
lots of comments, i.e. it just "feels" like an empty line. If "*" is
rare, then its entropy is high because it is unusual. And unusual
things should not be at the border of a hunk, I would assume.

So my prediction is that the 'subtleties of a particular language'
correlate highly with the actual use of characters.

Anyway, the experiment can be carried out later. :)

Thanks,
Stefan

>
> -Peff
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
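
For illustration only, here is a rough sketch of the heuristic proposed
in the thread: score each line, then end the hunk at the last line with
the lowest score. The thread does not define the entropy measure, so
this sketch assumes one: the summed surprisal (-log2 p) of a line's
characters, with p taken from the character frequencies of the text at
hand. Under that assumption an empty line scores zero and a line made
of common characters scores low. The function names are hypothetical;
none of this is git's code.

```python
import math
from collections import Counter


def char_probabilities(lines):
    """Relative frequency of each character across all lines."""
    counts = Counter(ch for line in lines for ch in line)
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}


def line_surprisal(line, probs):
    """Assumed 'entropy' of a line: sum of -log2 p(c) over its characters.

    An empty line scores 0; rare characters raise the score.
    """
    return sum(-math.log2(probs[ch]) for ch in line)


def pick_hunk_end(lines, candidates):
    """Return the last candidate index whose line has the lowest score."""
    probs = char_probabilities(lines)
    scores = {i: line_surprisal(lines[i], probs) for i in candidates}
    lowest = min(scores.values())
    return max(i for i, score in scores.items() if score == lowest)


if __name__ == "__main__":
    lines = ["int a;", "", "/* comment */", "", "int b;"]
    # Both empty lines score 0; the later one (index 3) is chosen.
    print(pick_hunk_end(lines, range(len(lines))))
```

Summing rather than averaging the surprisal is a design choice made
here so that short and empty lines win; an averaged score would rank
lines differently, and the thread leaves that question open.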