Re: [RFC] blame: new option to better handle merged cherry-picks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



"Bernhard R. Link" <brl+git@xxxxxxxxxxxxxx> writes:

> Allows to disable the git blame optimization of assuming that if there is a
> parent of a merge commit that has the exactly same file content, then
> only this parent is to be looked at.

I think this is what we usually call --full-history in "git log"
family, but more importantly, I do not think this is solving a valid
problem.

> This optimization, while being faster in the usual case, means that in
> the case of cherry-picks the blamed commit depends on which other commits
> touched a file.
>
> If for example one commit A modified both files b and c. And there are
> commits B and C, B only modifies file b and C only modifies file c
> (so that no conflicts happen), and assume A is cherry-picked as A'
> and the two branches then merged:
>
> --o-----B---A
>    \         \
>     ---C---A'--M---

So the contents of b at M is as the same as in A, so following 'b'
will see A and B changed that path, which is correct.

The contents of c at M is?  It is different from A because at A c
lacks the change made to it at C.  The merged result at M would
match C in A', no?  So following 'c' will see A' and C changed that
path, no?

So what is wrong about it?  If the original history were like this
instead, and A' were a cherry-pick of A, then what should happen?

> --o-----B---A'
>    \         \
>     ---C---A---M---

Don't we want to see c blamed the same way?

Also, when handling a merge, we have to handle parents sequencially,
checking the difference between M with its first parent first, and
then passing blame for the remaining common lines to the remaining
parents.  If you flip the order of parents of M when you merge A and
A' in your original history, and with your patch, what would you
see when you blame c?  Wouldn't it notice that M:c is identical to c
in its first parent (now A') and pass the whole blame to A' anyway
with or without your change?



> Then without this new option git blame blames the A|A' changes of
> file b to A while blaming the changes of c to A'.
> With the new option --no-parent-shortcut it blames both changes to the
> same commit.
>
> Signed-off-by: Bernhard R. Link <brlink@xxxxxxxxxx>
> ---
>  Documentation/blame-options.txt | 6 ++++++
>  builtin/blame.c                 | 5 ++++-
>  2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/blame-options.txt b/Documentation/blame-options.txt
> index 0cebc4f..55dd12b 100644
> --- a/Documentation/blame-options.txt
> +++ b/Documentation/blame-options.txt
> @@ -48,6 +48,12 @@ include::line-range-format.txt[]
>  	Show the result incrementally in a format designed for
>  	machine consumption.
>  
> +--no-parent-shortcut::
> +	Always look at all parents of a merge and do not shortcut
> +	to the first parent with no changes to the file looked at.
> +	This takes more time but produces more reliable results
> +	if branches with cherry-picked commits were merged.
> +
>  --encoding=<encoding>::
>  	Specifies the encoding used to output author names
>  	and commit summaries. Setting it to `none` makes blame
> diff --git a/builtin/blame.c b/builtin/blame.c
> index 4916eb2..dab2c36 100644
> --- a/builtin/blame.c
> +++ b/builtin/blame.c
> @@ -45,6 +45,7 @@ static int incremental;
>  static int xdl_opts;
>  static int abbrev = -1;
>  static int no_whole_file_rename;
> +static int no_parent_shortcut;
>  
>  static enum date_mode blame_date_mode = DATE_ISO8601;
>  static size_t blame_date_width;
> @@ -1248,7 +1249,8 @@ static void pass_blame(struct scoreboard *sb, struct origin *origin, int opt)
>  			porigin = find(sb, p, origin);
>  			if (!porigin)
>  				continue;
> -			if (!hashcmp(porigin->blob_sha1, origin->blob_sha1)) {
> +			if (!no_parent_shortcut &&
> +			    !hashcmp(porigin->blob_sha1, origin->blob_sha1)) {
>  				pass_whole_blame(sb, origin, porigin);
>  				origin_decref(porigin);
>  				goto finish;
> @@ -2247,6 +2249,7 @@ int cmd_blame(int argc, const char **argv, const char *prefix)
>  	static const char *contents_from = NULL;
>  	static const struct option options[] = {
>  		OPT_BOOL(0, "incremental", &incremental, N_("Show blame entries as we find them, incrementally")),
> +		OPT_BOOL(0, "no-parent-shortcut", &no_parent_shortcut, N_("Don't take shortcuts in some merges but handle cherry-picks better")),
>  		OPT_BOOL('b', NULL, &blank_boundary, N_("Show blank SHA-1 for boundary commits (Default: off)")),
>  		OPT_BOOL(0, "root", &show_root, N_("Do not treat root commits as boundaries (Default: off)")),
>  		OPT_BOOL(0, "show-stats", &show_stats, N_("Show work cost statistics")),
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]