Re: [PATCH] diff: implement config.diff.renames=copies-harder

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On Fri, Nov 3, 2023 at 4:25 AM Sam James via GitGitGadget
<gitgitgadget@xxxxxxxxx> wrote:
>
> From: Sam James <sam@xxxxxxxxxx>
>
> This patch adds a config value for 'diff.renames' called 'copies-harder'
> which make it so '-C -C' is in effect always passed for 'git log -p',
> 'git diff', etc.
>
> This allows specifying that 'git log -p', 'git diff', etc should always act
> as if '-C --find-copies-harder' was passed.
>
> I've found this especially useful for certain types of repository (like
> Gentoo's ebuild repositories) because files are often copies of a previous
> version.

These must be very small repositories?  --find-copies-harder is really
expensive...

But, if you are willing to pay the price, the idea of making this a
configuration item makes sense.

> Signed-off-by: Sam James <sam@xxxxxxxxxx>
> ---
>     diff: implement config.diff.renames=copies-harder
>
>     This patch adds a config value for 'diff.renames' called 'copies-harder'
>     which make it so '-C -C' is in effect always passed for 'git log -p',
>     'git diff', etc.
>
>     This allows specifying that 'git log -p', 'git diff', etc should always
>     act as if '-C --find-copies-harder' was passed.
>
>     I've found this especially useful for certain types of repository (like
>     Gentoo's ebuild repositories) because files are often copies of a
>     previous version.
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1606%2Fthesamesam%2Fconfig-copies-harder-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1606/thesamesam/config-copies-harder-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1606
>
>  Documentation/config/diff.txt   |  3 ++-
>  Documentation/config/status.txt |  3 ++-
>  diff.c                          | 12 +++++++++---
>  diff.h                          |  1 +
>  diffcore-rename.c               |  4 ++--
>  merge-ort.c                     |  2 +-
>  merge-recursive.c               |  2 +-
>  7 files changed, 18 insertions(+), 9 deletions(-)
>
> diff --git a/Documentation/config/diff.txt b/Documentation/config/diff.txt
> index bd5ae0c3378..d2ff3c62d41 100644
> --- a/Documentation/config/diff.txt
> +++ b/Documentation/config/diff.txt
> @@ -131,7 +131,8 @@ diff.renames::
>         Whether and how Git detects renames.  If set to "false",
>         rename detection is disabled. If set to "true", basic rename
>         detection is enabled.  If set to "copies" or "copy", Git will
> -       detect copies, as well.  Defaults to true.  Note that this
> +       detect copies, as well.  If set to "copies-harder", Git will try harder
> +       to detect copies.  Defaults to true.  Note that this

"try harder to detect copies" feels like an unhelpful explanation.  I
understand that a lengthy explanation (like the one found under the
`--find-copies-harder` option in git-diff) may not be wanted here
since we are trying to describe things succinctly, but could we at
least reference the `--find-copies-harder` option so that people know
where to go to get a more detailed explanation?

>         affects only 'git diff' Porcelain like linkgit:git-diff[1] and
>         linkgit:git-log[1], and not lower level commands such as
>         linkgit:git-diff-files[1].
> diff --git a/Documentation/config/status.txt b/Documentation/config/status.txt
> index 2ff8237f8fc..7ca7a4becd7 100644
> --- a/Documentation/config/status.txt
> +++ b/Documentation/config/status.txt
> @@ -33,7 +33,8 @@ status.renames::
>         Whether and how Git detects renames in linkgit:git-status[1] and
>         linkgit:git-commit[1] .  If set to "false", rename detection is
>         disabled. If set to "true", basic rename detection is enabled.
> -       If set to "copies" or "copy", Git will detect copies, as well.
> +       If set to "copies" or "copy", Git will detect copies, as well.  If
> +       set to "copies-harder", Git will try harder to detect copies.

Same here.

>         Defaults to the value of diff.renames.
>
>  status.showStash::
> diff --git a/diff.c b/diff.c
> index 2c602df10a3..0ca906611f5 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -206,8 +206,11 @@ int git_config_rename(const char *var, const char *value)
>  {
>         if (!value)
>                 return DIFF_DETECT_RENAME;
> +       if (!strcasecmp(value, "copies-harder"))
> +               return DIFF_DETECT_COPY_HARDER;
>         if (!strcasecmp(value, "copies") || !strcasecmp(value, "copy"))
> -               return  DIFF_DETECT_COPY;
> +               return DIFF_DETECT_COPY;
> +

As per CodingGuidelines:
"""
 - Fixing style violations while working on a real change as a
   preparatory clean-up step is good, but otherwise avoid useless code
   churn for the sake of conforming to the style.
"""
So, the fixing of extra space and the extra blank line should be
placed in a separate patch.

>         return git_config_bool(var,value) ? DIFF_DETECT_RENAME : 0;
>  }
>
> @@ -4832,8 +4835,11 @@ void diff_setup_done(struct diff_options *options)
>         else
>                 options->flags.diff_from_contents = 0;
>
> -       if (options->flags.find_copies_harder)
> +       /* Just fold this in as it makes the patch-to-git smaller */
> +       if (options->flags.find_copies_harder || options->detect_rename == DIFF_DETECT_COPY_HARDER) {

As per CodingGuidelines, this line is too long and should be split
across two lines at the `||`.

> +               options->flags.find_copies_harder = 1;
>                 options->detect_rename = DIFF_DETECT_COPY;
> +       }
>
>         if (!options->flags.relative_name)
>                 options->prefix = NULL;
> @@ -5264,7 +5270,7 @@ static int diff_opt_find_copies(const struct option *opt,
>         if (*arg != 0)
>                 return error(_("invalid argument to %s"), opt->long_name);
>
> -       if (options->detect_rename == DIFF_DETECT_COPY)
> +       if (options->detect_rename == DIFF_DETECT_COPY || options->detect_rename == DIFF_DETECT_COPY_HARDER)

Also too long.

>                 options->flags.find_copies_harder = 1;
>         else
>                 options->detect_rename = DIFF_DETECT_COPY;
> diff --git a/diff.h b/diff.h
> index 66bd8aeb293..b29e5b777f8 100644
> --- a/diff.h
> +++ b/diff.h
> @@ -555,6 +555,7 @@ int git_config_rename(const char *var, const char *value);
>
>  #define DIFF_DETECT_RENAME     1
>  #define DIFF_DETECT_COPY       2
> +#define DIFF_DETECT_COPY_HARDER 3
>
>  #define DIFF_PICKAXE_ALL       1
>  #define DIFF_PICKAXE_REGEX     2
> diff --git a/diffcore-rename.c b/diffcore-rename.c
> index 5a6e2bcac71..856291d66f2 100644
> --- a/diffcore-rename.c
> +++ b/diffcore-rename.c
> @@ -299,7 +299,7 @@ static int find_identical_files(struct hashmap *srcs,
>                 }
>                 /* Give higher scores to sources that haven't been used already */
>                 score = !source->rename_used;
> -               if (source->rename_used && options->detect_rename != DIFF_DETECT_COPY)
> +               if (source->rename_used && options->detect_rename != DIFF_DETECT_COPY && options->detect_rename != DIFF_DETECT_COPY_HARDER)

This line should also be split.

>                         continue;
>                 score += basename_same(source, target);
>                 if (score > best_score) {
> @@ -1405,7 +1405,7 @@ void diffcore_rename_extended(struct diff_options *options,
>         trace2_region_enter("diff", "setup", options->repo);
>         info.setup = 0;
>         assert(!dir_rename_count || strmap_empty(dir_rename_count));
> -       want_copies = (detect_rename == DIFF_DETECT_COPY);
> +       want_copies = (detect_rename == DIFF_DETECT_COPY || detect_rename == DIFF_DETECT_COPY_HARDER);

and so should this one.

>         if (dirs_removed && (break_idx || want_copies))
>                 BUG("dirs_removed incompatible with break/copy detection");
>         if (break_idx && relevant_sources)
> diff --git a/merge-ort.c b/merge-ort.c
> index 6491070d965..77498354652 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -4782,7 +4782,7 @@ static void merge_start(struct merge_options *opt, struct merge_result *result)
>          * sanity check them anyway.
>          */
>         assert(opt->detect_renames >= -1 &&
> -              opt->detect_renames <= DIFF_DETECT_COPY);
> +              opt->detect_renames <= DIFF_DETECT_COPY_HARDER);
>         assert(opt->verbosity >= 0 && opt->verbosity <= 5);
>         assert(opt->buffer_output <= 2);
>         assert(opt->obuf.len == 0);
> diff --git a/merge-recursive.c b/merge-recursive.c
> index e3beb0801b1..d52dd536606 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -3708,7 +3708,7 @@ static int merge_start(struct merge_options *opt, struct tree *head)
>         assert(opt->branch1 && opt->branch2);
>
>         assert(opt->detect_renames >= -1 &&
> -              opt->detect_renames <= DIFF_DETECT_COPY);
> +              opt->detect_renames <= DIFF_DETECT_COPY_HARDER);
>         assert(opt->detect_directory_renames >= MERGE_DIRECTORY_RENAMES_NONE &&
>                opt->detect_directory_renames <= MERGE_DIRECTORY_RENAMES_TRUE);
>         assert(opt->rename_limit >= -1);
>
> base-commit: 692be87cbba55e8488f805d236f2ad50483bd7d5
> --
> gitgitgadget

The overall patch makes sense and looks good, modulo some minor
stylistic things that need cleaning up.





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux