Re: [PATCH v2] diff: implement config.diff.renames=copies-harder

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Sam James <sam@xxxxxxxxxx> writes:

> [[PGP Signed Part:Undecided]]
>
> Elijah Newren <newren@xxxxxxxxx> writes:
>
>> Hi,
>>
>> It helps when providing a new iteration of a patch to do two things:
>>   (1) provide a range-diff against the previous version to make it
>> easier for reviewers to see what has changed
>>   (2) either reply to the previous submission or link to it in your
>> header so reviewers can find previous comments
>
> Thanks, I'll do that for upcoming v3.
>
>>
>> In this case, v1 was over at
>> https://lore.kernel.org/git/pull.1606.git.1699010701704.gitgitgadget@xxxxxxxxx/.
>>
>> On Tue, Dec 26, 2023 at 12:21 PM Sam James <sam@xxxxxxxxxx> wrote:
>>>
>>> This patch adds a config value for 'diff.renames' called 'copies-harder'
>>> which make it so '-C -C' is in effect always passed for 'git log -p',
>>> 'git diff', etc.
>>>
>>> This allows specifying that 'git log -p', 'git diff', etc should always act
>>> as if '-C --find-copies-harder' was passed.
>>>
>>> It has proven this especially useful for certain types of repository (like
>>> Gentoo's ebuild repositories) because files are often copies of a previous
>>> version:
>>>
>>> Suppose a directory 'sys-devel/gcc' contains recipes for building
>>> GCC, with one file for each supported upstream branch:
>>>    gcc-13.x.build.recipe
>>>    gcc-12.x.build.recipe
>>>    gcc-11.x.build.recipe
>>>    gcc-10.x.build.recipe
>>>
>>> gcc-13.x.build.recipe was started as a copy of gcc-12.x.build.recipe
>>> (which was started as a copy of gcc-11.x.build.recipe, etc.). Previous versions
>>> are kept around to support parallel installation of multiple versions.
>>>
>>> Being able to easily observe the diff relative to other recipes within the
>>> directory has been a quality of life improvement for such repo layouts.
>>>
>>> Cc: Elijah Newren <newren@xxxxxxxxx>
>>> Cc: Junio C Hamano <gitster@xxxxxxxxx>
>>> Signed-off-by: Sam James <sam@xxxxxxxxxx>
>>
>> While the "Cc:" lines don't hurt, note that everything above this
>> point is included in the commit message, so we'd typically rather
>> those two lines be below the triple-dashed line.
>
> ACK.
>
>>
>>> ---
>>> v2: Improve the commit message using Elijah Newren's example (it is indeed
>>> precisely what I was thinking of, but phrased better), and improve the documentation
>>> to explain better what the new config option actually does & refer to git-diff's
>>> --find-copies-harder.
>>>
>>>  Documentation/config/diff.txt   |  8 +++++---
>>>  Documentation/config/status.txt |  3 ++-
>>>  diff.c                          | 12 +++++++++---
>>>  diff.h                          |  1 +
>>>  diffcore-rename.c               |  4 ++--
>>>  merge-ort.c                     |  2 +-
>>>  merge-recursive.c               |  2 +-
>>>  7 files changed, 21 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/Documentation/config/diff.txt b/Documentation/config/diff.txt
>>> index bd5ae0c337..cdd8a74ec0 100644
>>> --- a/Documentation/config/diff.txt
>>> +++ b/Documentation/config/diff.txt
>>> @@ -131,9 +131,11 @@ diff.renames::
>>>         Whether and how Git detects renames.  If set to "false",
>>>         rename detection is disabled. If set to "true", basic rename
>>>         detection is enabled.  If set to "copies" or "copy", Git will
>>> -       detect copies, as well.  Defaults to true.  Note that this
>>> -       affects only 'git diff' Porcelain like linkgit:git-diff[1] and
>>> -       linkgit:git-log[1], and not lower level commands such as
>>> +       detect copies, as well.  If set to "copies-harder", Git will spend extra
>>> +       cycles to find more copies even in unmodified paths, see
>>> +       '--find-copies-harder' in linkgit:git-diff[1]. Defaults to true.
>>> +       Note that this affects only 'git diff' Porcelain like linkgit:git-diff[1]
>>> +       and linkgit:git-log[1], and not lower level commands such as
>>>         linkgit:git-diff-files[1].
>>>
>>>  diff.suppressBlankEmpty::
>>> diff --git a/Documentation/config/status.txt b/Documentation/config/status.txt
>>> index 2ff8237f8f..7ca7a4becd 100644
>>> --- a/Documentation/config/status.txt
>>> +++ b/Documentation/config/status.txt
>>> @@ -33,7 +33,8 @@ status.renames::
>>>         Whether and how Git detects renames in linkgit:git-status[1] and
>>>         linkgit:git-commit[1] .  If set to "false", rename detection is
>>>         disabled. If set to "true", basic rename detection is enabled.
>>> -       If set to "copies" or "copy", Git will detect copies, as well.
>>> +       If set to "copies" or "copy", Git will detect copies, as well.  If
>>> +       set to "copies-harder", Git will try harder to detect copies.
>>
>> It'd be nice to have the improved text from diff.renames here ("If set
>> to "copies-harder", Git will spend extra cycles to find more copies
>> even in unmodified paths.").
>>
>>>         Defaults to the value of diff.renames.
>>>
>>>  status.showStash::
>>> diff --git a/diff.c b/diff.c
>>> index a2def45644..585acf9a99 100644
>>> --- a/diff.c
>>> +++ b/diff.c
>>> @@ -206,8 +206,11 @@ int git_config_rename(const char *var, const char *value)
>>>  {
>>>         if (!value)
>>>                 return DIFF_DETECT_RENAME;
>>> +       if (!strcasecmp(value, "copies-harder"))
>>> +               return DIFF_DETECT_COPY_HARDER;
>>>         if (!strcasecmp(value, "copies") || !strcasecmp(value, "copy"))
>>> -               return  DIFF_DETECT_COPY;
>>> +               return DIFF_DETECT_COPY;
>>> +
>>
>> I pointed out in response to v1 that these last couple lines, while a
>> nice cleanup, should be in a separate patch.
>>
>>>         return git_config_bool(var,value) ? DIFF_DETECT_RENAME : 0;
>>>  }
>>>
>>> @@ -4849,8 +4852,11 @@ void diff_setup_done(struct diff_options *options)
>>>         else
>>>                 options->flags.diff_from_contents = 0;
>>>
>>> -       if (options->flags.find_copies_harder)
>>> +       /* Just fold this in as it makes the patch-to-git smaller */
>>> +       if (options->flags.find_copies_harder || options->detect_rename == DIFF_DETECT_COPY_HARDER) {
>>
>> Again, I already responded to v1 that four of your lines were too long
>> and needed to be split.  All four remain unsplit in your resubmission.
>> For future reference, when you resubmit a patch, please either (a)
>> first respond to the review explaining why you don't agree with the
>> suggested changes, or (b) include the fixes reviewers point out, or
>> (c) state in your cover letter or explanation after the '---' why you
>> chose to not include the suggested changes.
>
> I apologise for the rookie errors, I don't really have an excuse either.
>
>>
>>> +               options->flags.find_copies_harder = 1;
>>>                 options->detect_rename = DIFF_DETECT_COPY;
>>> +       }
>>>
>>>         if (!options->flags.relative_name)
>>>                 options->prefix = NULL;
>>> @@ -5281,7 +5287,7 @@ static int diff_opt_find_copies(const struct option *opt,
>>>         if (*arg != 0)
>>>                 return error(_("invalid argument to %s"), opt->long_name);
>>>
>>> -       if (options->detect_rename == DIFF_DETECT_COPY)
>>> +       if (options->detect_rename == DIFF_DETECT_COPY || options->detect_rename == DIFF_DETECT_COPY_HARDER)
>>>                 options->flags.find_copies_harder = 1;
>>>         else
>>>                 options->detect_rename = DIFF_DETECT_COPY;
>>> diff --git a/diff.h b/diff.h
>>> index 66bd8aeb29..b29e5b777f 100644
>>> --- a/diff.h
>>> +++ b/diff.h
>>> @@ -555,6 +555,7 @@ int git_config_rename(const char *var, const char *value);
>>>
>>>  #define DIFF_DETECT_RENAME     1
>>>  #define DIFF_DETECT_COPY       2
>>> +#define DIFF_DETECT_COPY_HARDER 3
>>>
>>>  #define DIFF_PICKAXE_ALL       1
>>>  #define DIFF_PICKAXE_REGEX     2
>>> diff --git a/diffcore-rename.c b/diffcore-rename.c
>>> index 5a6e2bcac7..856291d66f 100644
>>> --- a/diffcore-rename.c
>>> +++ b/diffcore-rename.c
>>> @@ -299,7 +299,7 @@ static int find_identical_files(struct hashmap *srcs,
>>>                 }
>>>                 /* Give higher scores to sources that haven't been used already */
>>>                 score = !source->rename_used;
>>> -               if (source->rename_used && options->detect_rename != DIFF_DETECT_COPY)
>>> +               if (source->rename_used && options->detect_rename != DIFF_DETECT_COPY && options->detect_rename != DIFF_DETECT_COPY_HARDER)
>>>                         continue;
>>>                 score += basename_same(source, target);
>>>                 if (score > best_score) {
>>> @@ -1405,7 +1405,7 @@ void diffcore_rename_extended(struct diff_options *options,
>>>         trace2_region_enter("diff", "setup", options->repo);
>>>         info.setup = 0;
>>>         assert(!dir_rename_count || strmap_empty(dir_rename_count));
>>> -       want_copies = (detect_rename == DIFF_DETECT_COPY);
>>> +       want_copies = (detect_rename == DIFF_DETECT_COPY || detect_rename == DIFF_DETECT_COPY_HARDER);
>>>         if (dirs_removed && (break_idx || want_copies))
>>>                 BUG("dirs_removed incompatible with break/copy detection");
>>>         if (break_idx && relevant_sources)
>>> diff --git a/merge-ort.c b/merge-ort.c
>>> index 6491070d96..7749835465 100644
>>> --- a/merge-ort.c
>>> +++ b/merge-ort.c
>>> @@ -4782,7 +4782,7 @@ static void merge_start(struct merge_options *opt, struct merge_result *result)
>>>          * sanity check them anyway.
>>>          */
>>>         assert(opt->detect_renames >= -1 &&
>>> -              opt->detect_renames <= DIFF_DETECT_COPY);
>>> +              opt->detect_renames <= DIFF_DETECT_COPY_HARDER);
>>>         assert(opt->verbosity >= 0 && opt->verbosity <= 5);
>>>         assert(opt->buffer_output <= 2);
>>>         assert(opt->obuf.len == 0);
>>> diff --git a/merge-recursive.c b/merge-recursive.c
>>> index e3beb0801b..d52dd53660 100644
>>> --- a/merge-recursive.c
>>> +++ b/merge-recursive.c
>>> @@ -3708,7 +3708,7 @@ static int merge_start(struct merge_options *opt, struct tree *head)
>>>         assert(opt->branch1 && opt->branch2);
>>>
>>>         assert(opt->detect_renames >= -1 &&
>>> -              opt->detect_renames <= DIFF_DETECT_COPY);
>>> +              opt->detect_renames <= DIFF_DETECT_COPY_HARDER);
>>>         assert(opt->detect_directory_renames >= MERGE_DIRECTORY_RENAMES_NONE &&
>>>                opt->detect_directory_renames <= MERGE_DIRECTORY_RENAMES_TRUE);
>>>         assert(opt->rename_limit >= -1);
>>> --
>>> 2.43.0
>>
>> The patch looks close, but it does continue to violate style
>> guidelines and make unrelated changes, like v1.  And the wording in
>> one of the documentation blurbs could be improved a little.  Looking
>> forward to a v3 with those fixed.
>
> I've not finished checking yet, but I think
> 5825268db1058516d05be03d6a8d8d55eea5a943 fixed a bug which I'd been
> relying on for something, where I may need to add another option for git
> log to suppress copies (for scripts).
>
> Should I send that in a series when doing v3 if it ends up being
> required?

Ah, nevermind on that, I figured it out & no need.

>
> thanks,
> sam
>
> [[End of PGP Signed Part]]

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux