Re: [PATCH v3 2/2] diff: teach diff to read algorithm from diff driver

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



"John Cai via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:

Looking good.  Some comments below.  Many of them minor.

> +Setting the internal diff algorithm
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +The diff algorithm can be set through the `diff.algorithm` config key, but
> +sometimes it may be helpful to set the diff algorithm by path. For example, one

I would have expected "per path" instead of "by path".

> +might wish to set a diff algorithm automatically for all `.json` files such that
> +the user would not need to pass in a separate command line `--diff-algorithm`
> +flag each time.

While this is not incorrect per-se, I think the first paragraph of
the proposed commit log message was a lot more convincing.  Your
changes may not be limited to a single kind of files, and a command
line option is simply not enough.  You may want one algorithm for
".json" while using another for ".c", which was really an excellent
example you gave.

> +This diff algorithm applies to user facing diff output like git-diff(1),
> +git-show(1) and is used for the `--stat` output as well. The merge machinery
> +will not use the diff algorithm set through this method.

Is "format-patch" considered "user-facing"?

> +NOTE: If the `command` key also exists, then Git will treat this as an external
> +diff and attempt to use the value set for `command` as an external program. For
> +instance, the following config, combined with the above `.gitattributes` file,
> +will result in `command` favored over `algorithm`.
> +
> +----------------------------------------------------------------
> +[diff "<name>"]
> +  command = j-c-diff
> +  algorithm = histogram
> +----------------------------------------------------------------

Isn't this a bit too verbose, given that the reader has just seen
the external diff driver section.  I wonder something like this is
sufficient, without any sample configuration?

    NOTE: If `diff.<name>.command` is defined for path with the
    `diff=<name>` attribute, it is executed as an external diff driver
    (see above), and adding `diff.<name>.algorithm` has no effect (the
    algorithm is not passed to the external diff driver).


> diff --git a/diff.c b/diff.c
> index 5efc22ca06b..04469da6d34 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -4456,15 +4456,13 @@ static void run_diff_cmd(const char *pgm,
>  	const char *xfrm_msg = NULL;
>  	int complete_rewrite = (p->status == DIFF_STATUS_MODIFIED) && p->score;
>  	int must_show_header = 0;
> +	struct userdiff_driver *drv = NULL;
>  
> -
> -	if (o->flags.allow_external) {
> -		struct userdiff_driver *drv;
> -
> +	if (o->flags.allow_external || !o->ignore_driver_algorithm)
>  		drv = userdiff_find_by_path(o->repo->index, attr_path);
> -		if (drv && drv->external)
> -			pgm = drv->external;
> -	}
> +
> +	if (o->flags.allow_external && drv && drv->external)
> +		pgm = drv->external;

OK.  There is no explicit "pgm = NULL" initialization in this
function, but that is done by the caller passing NULL to the
function as a parameter, so it all makes sense.

> @@ -4481,12 +4479,16 @@ static void run_diff_cmd(const char *pgm,
>  		run_external_diff(pgm, name, other, one, two, xfrm_msg, o);
>  		return;
>  	}
> -	if (one && two)
> +	if (one && two) {
> +		if (drv && !o->ignore_driver_algorithm && drv->algorithm)
> +			set_diff_algorithm(o, drv->algorithm);

For symmetry with the above choice of pgm we just saw, the order of
the condition might be easier to follow if written like so:

	if (!o->ignore_driver_algorithm && drv && drv->algorithm)

It would not make any measurable difference performance-wise either way.

> @@ -4583,6 +4585,14 @@ static void run_diffstat(struct diff_filepair *p, struct diff_options *o,
>  	const char *name;
>  	const char *other;
>  
> +	if (!o->ignore_driver_algorithm) {
> +		struct userdiff_driver *drv = userdiff_find_by_path(o->repo->index, p->one->path);

That's an overlong line.

> +
> +		if (drv && drv->algorithm) {
> +			set_diff_algorithm(o, drv->algorithm);
> +		}

No need to have {} around a single statement block.

> +	}
> +
>  	if (DIFF_PAIR_UNMERGED(p)) {
>  		/* unmerged */
>  		builtin_diffstat(p->one->path, NULL, NULL, NULL,
> @@ -5130,6 +5140,8 @@ static int diff_opt_diff_algorithm(const struct option *opt,
>  		return error(_("option diff-algorithm accepts \"myers\", "
>  			       "\"minimal\", \"patience\" and \"histogram\""));
>  
> +	options->ignore_driver_algorithm = 1;
> +
>  	return 0;
>  }
>  
> @@ -5145,6 +5157,8 @@ static int diff_opt_diff_algorithm_no_arg(const struct option *opt,
>  		BUG("available diff algorithms include \"myers\", "
>  			       "\"minimal\", \"patience\" and \"histogram\"");
>  
> +	options->ignore_driver_algorithm = 1;
> +
>  	return 0;
>  }
> @@ -5285,6 +5299,7 @@ static int diff_opt_patience(const struct option *opt,
>  	for (i = 0; i < options->anchors_nr; i++)
>  		free(options->anchors[i]);
>  	options->anchors_nr = 0;
> +	options->ignore_driver_algorithm = 1;
>  
>  	return set_diff_algorithm(options, "patience");
>  }


I was hoping that set_diff_algorithm() can be the shared common one
that signals we were told to use a specific algorithm, but it also
is called from internal codepaths so it cannot be it.

It is probably not worth introducing an extra helper that only calls
set_diff_algorithm() and sets ignore_driver_algorithm bit only for
that to reduce three-times repetition.

OK.

> diff --git a/userdiff.c b/userdiff.c
> index d71b82feb74..ff25cfc4b4c 100644
> --- a/userdiff.c
> +++ b/userdiff.c
> @@ -293,7 +293,7 @@ PATTERNS("scheme",
>  	 "|([^][)(}{[ \t])+"),
>  PATTERNS("tex", "^(\\\\((sub)*section|chapter|part)\\*{0,1}\\{.*)$",
>  	 "\\\\[a-zA-Z@]+|\\\\.|[a-zA-Z0-9\x80-\xff]+"),
> -{ "default", NULL, -1, { NULL, 0 } },
> +{ "default", NULL, NULL, -1, { NULL, 0 } },
>  };

I was surprised that there is so little damage to the built-in
userdiff driver definitions, but this is thanks to the PATTERNS()
and IPATTERN() macro that use designated initializers.  Very nice.

Nicely done.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux