Re: [PATCH 8/8] diff: improve positioning of add/delete blocks in diffs

On Thu, Aug 4, 2016 at 12:56 AM, Jeff King <peff@xxxxxxxx> wrote:
> On Thu, Aug 04, 2016 at 12:00:36AM +0200, Michael Haggerty wrote:
>
>> This table shows the number of diff slider groups that were positioned
>> differently than the human-generated values, for various repositories.
>> "default" is the default "git diff" algorithm. "compaction" is Git 2.9.0
>> with the `--compaction-heuristic` option "indent" is an earlier,
>
> s/option/&./
>
>>  static int diff_detect_rename_default;
>> +static int diff_indent_heuristic; /* experimental */
>>  static int diff_compaction_heuristic; /* experimental */
>
> These two flags are mutually exclusive in the xdiff code, so we should
> probably handle that here.
>
> TBH, I do not care that much what:
>
>   [diff]
>   compactionHeuristic = true
>   indentHeuristic = true
>
> does. But right now:
>
>   git config diff.compactionHeuristic true
>   git show --indent-heuristic
>
> still prefers the compaction heuristic, which I think is objectively
> wrong.
>
> So perhaps we need a single variable:
>
>   enum {
>     DIFF_HEURISTIC_COMPACTION,
>     DIFF_HEURISTIC_INDENT
>   } diff_heuristic;
>
> and set it in last-one-wins fashion (it would be nice if the config and
> command line options were shaped the same way so it's clear to the user
> that they are exclusive, but we may have to keep --compaction-heuristic
> around for compatibility, as an alias for --diff-heuristic=compaction).
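
For illustration, a minimal sketch of how a single variable set in last-one-wins fashion might look. Only the two DIFF_HEURISTIC_* names come from the proposal above. DIFF_HEURISTIC_NONE, the enum tag, and both helper functions are hypothetical and not taken from the patch.

```c
#include <string.h>

/*
 * Sketch only. DIFF_HEURISTIC_COMPACTION and DIFF_HEURISTIC_INDENT are
 * from the proposal; everything else here is a hypothetical name.
 */
enum diff_heuristic_kind {
	DIFF_HEURISTIC_NONE,
	DIFF_HEURISTIC_COMPACTION,
	DIFF_HEURISTIC_INDENT
};

static enum diff_heuristic_kind diff_heuristic = DIFF_HEURISTIC_NONE;

/* Handle one boolean config key; return 1 if the key was recognized. */
static int heuristic_from_config(const char *key, int value)
{
	enum diff_heuristic_kind which;

	if (!strcmp(key, "diff.compactionheuristic"))
		which = DIFF_HEURISTIC_COMPACTION;
	else if (!strcmp(key, "diff.indentheuristic"))
		which = DIFF_HEURISTIC_INDENT;
	else
		return 0;

	if (value)
		diff_heuristic = which;		/* last one wins */
	else if (diff_heuristic == which)
		diff_heuristic = DIFF_HEURISTIC_NONE;
	return 1;
}

/* Handle one command-line option; return 1 if it was recognized. */
static int heuristic_from_option(const char *arg)
{
	if (!strcmp(arg, "--compaction-heuristic"))
		diff_heuristic = DIFF_HEURISTIC_COMPACTION;
	else if (!strcmp(arg, "--indent-heuristic"))
		diff_heuristic = DIFF_HEURISTIC_INDENT;
	else
		return 0;
	return 1;
}
```

If the config is read before the command line is parsed, "git show --indent-heuristic" would override diff.compactionHeuristic = true, because the later assignment simply replaces the earlier one.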
>
>> diff --git a/git-add--interactive.perl b/git-add--interactive.perl
>> index 642cce1..ee3d812 100755
>> --- a/git-add--interactive.perl
>> +++ b/git-add--interactive.perl
>> @@ -45,6 +45,7 @@ my ($diff_new_color) =
>>  my $normal_color = $repo->get_color("", "reset");
>>
>>  my $diff_algorithm = $repo->config('diff.algorithm');
>> +my $diff_indent_heuristic = $repo->config_bool('diff.indentheuristic');
>>  my $diff_compaction_heuristic = $repo->config_bool('diff.compactionheuristic');
>
> Nice touch.
>
> Unfortunately the mutual-exclusivity handling will probably bleed over
> to here, too.
>
>> +/*
>> + * If a line is indented more than this, get_indent() just returns this value.
>> + * This avoids having to do absurd amounts of work for data that are not
>> + * human-readable text, and also ensures that the output of get_indent fits within
>> + * an int.
>> + */
>> +#define MAX_INDENT 200
>
> Speaking of absurd amounts of work, I was curious if there was a
> noticeable performance penalty for using this heuristic (just because
> it's a lot more complicated than the others). I couldn't detect any
> differences running "git log -p --no-merges -3000" on git.git with no
> heuristic, compaction, and indent. There may be other repositories that
> behave more pathologically (it looks like having 20 blank lines at the
> end of each hunk?), but I'd guess in most cases this will always be
> drowned out in the noise of doing the actual diff.
>
>> +#define START_OF_FILE_BONUS 9
>> +#define END_OF_FILE_BONUS 46
>> +#define TOTAL_BLANK_WEIGHT 4
>> +#define PRE_BLANK_WEIGHT 16
>> +#define RELATIVE_INDENT_BONUS -1
>> +#define RELATIVE_INDENT_HAS_BLANK_BONUS 15
>> +#define RELATIVE_OUTDENT_BONUS -19
>> +#define RELATIVE_OUTDENT_HAS_BLANK_BONUS 2
>> +#define RELATIVE_DEDENT_BONUS -63
>> +#define RELATIVE_DEDENT_HAS_BLANK_BONUS 50
>
> I see there is a comment below here mentioning that these are empirical
> voodoo, but it might be worth one at the top (or just moving these below
> the comment) because the comment looks like it's just associated with
> the function (and these are sufficiently bizarre that anybody reading is
> going to double-take on them).
>
>> +        return 10 * score - bonus;
>
> I don't mind this "10" not being a #define constant, but after
> reading the exchange between you and Stefan, I think it would be nice to
> describe what it is in a comment. The rest of the function is commented
> so nicely that this one left me thinking "huh?" upon seeing the "10".

After a night of sleep I agree with Peff's statement here: it's not about the
#define, it's about the comment (which the #define would have given in a
short, cryptic way in angry capital letters).

I have just reread the scoring function, and I think you could pull out the
`score = indent` assignment (it is always assigned except when indent < 0):

        if (indent == -1)
               score = 0;
        else
               score = indent;
        ... lots of bonus computation below, which in its current implementation
        has lots of "score = indent;" lines as well.
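
As a rough, self-contained sketch of that shape: base_score() and final_score() are hypothetical names, the bonus is passed in as a parameter instead of being computed as in the patch, and treating -1 as the "no usable indentation" marker is an assumption based on the snippet above.

```c
/*
 * Sketch only: base_score() and final_score() are hypothetical names,
 * and the bonus is passed in rather than computed as in the patch.
 */
static int base_score(int indent)
{
	/* assumption: -1 marks a line with no usable indentation */
	return indent == -1 ? 0 : indent;
}

static int final_score(int indent, int bonus)
{
	int score = base_score(indent);	/* assigned exactly once */

	/* ... bonus computation would go here, touching only "bonus" ... */

	return 10 * score - bonus;
}
```

With the assignment hoisted like this, the bonus branches no longer need to repeat "score = indent;", and the final "10 * score - bonus" line becomes the one place that would need an explanatory comment.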

Thanks,
Stefan