Re: [PATCH v2 4/8] merge-tree: implement real merges

Hi Elijah,

On Wed, 5 Jan 2022, Elijah Newren via GitGitGadget wrote:

> From: Elijah Newren <newren@xxxxxxxxx>
>
> This adds the ability to perform real merges rather than just trivial
> merges (meaning handling three way content merges, recursive ancestor
> consolidation, renames, proper directory/file conflict handling, and so
> forth).  However, unlike `git merge`, the working tree and index are
> left alone and no branch is updated.
>
> The only output is:
>   - the toplevel resulting tree printed on stdout
>   - exit status of 0 (clean) or 1 (conflicts present)
>
> This output is mean to be used by some higher level script, perhaps in a
                 ^^^^

My apologies for pointing out a grammar issue: this was probably intended
to say "meant", as the word "mean" changes the sense of the sentence.

In my defense, I have more substantial suggestions below.

> sequence of steps like this:
>
>    NEWTREE=$(git merge-tree --real $BRANCH1 $BRANCH2)
>    test $? -eq 0 || die "There were conflicts..."
>    NEWCOMMIT=$(git commit-tree $NEWTREE -p $BRANCH1 -p $BRANCH2)
>    git update-ref $BRANCH1 $NEWCOMMIT
>
> Note that higher level scripts may also want to access the
> conflict/warning messages normally output during a merge, or have quick
> access to a list of files with conflicts.  That is not available in this
> preliminary implementation, but subsequent commits will add that
> ability.
>
> Signed-off-by: Elijah Newren <newren@xxxxxxxxx>
> ---
>  Documentation/git-merge-tree.txt | 28 +++++++----
>  builtin/merge-tree.c             | 55 +++++++++++++++++++++-
>  t/t4301-merge-tree-real.sh       | 81 ++++++++++++++++++++++++++++++++
>  3 files changed, 153 insertions(+), 11 deletions(-)
>  create mode 100755 t/t4301-merge-tree-real.sh
>
> diff --git a/Documentation/git-merge-tree.txt b/Documentation/git-merge-tree.txt
> index 58731c19422..5823938937f 100644
> --- a/Documentation/git-merge-tree.txt
> +++ b/Documentation/git-merge-tree.txt
> @@ -3,26 +3,34 @@ git-merge-tree(1)
>
>  NAME
>  ----
> -git-merge-tree - Show three-way merge without touching index
> +git-merge-tree - Perform merge without touching index or working tree
>
>
>  SYNOPSIS
>  --------
>  [verse]
> +'git merge-tree' --real <branch1> <branch2>
>  'git merge-tree' <base-tree> <branch1> <branch2>

Here is an idea: How about aiming for this synopsis instead, exploiting
the fact that the "real" mode takes a different number of arguments?

   'git merge-tree' [--write-tree] <branch1> <branch2>
   'git merge-tree' [--demo-trivial-merge] <base-tree> <branch1> <branch2>

That way, the old mode can still function, and can even at some stage be
deprecated and eventually removed.
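
To make the idea a bit more concrete, here is a rough sketch of how the
mode could be inferred from the number of positional arguments when
neither flag is given. It is written in shell purely for illustration;
`pick_mode` is a hypothetical helper, and the real dispatch would of
course have to live in C, in `cmd_merge_tree()`.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only: choose the merge-tree mode
# from the number of positional arguments left after option parsing.
pick_mode () {
	case $# in
	2)
		# <branch1> <branch2>
		echo write-tree
		;;
	3)
		# <base-tree> <branch1> <branch2>
		echo trivial-merge
		;;
	*)
		# Anything else is a usage error
		echo usage
		return 129
		;;
	esac
}

pick_mode side1 side2        # prints "write-tree"
pick_mode base side1 side2   # prints "trivial-merge"
```

An explicit `--write-tree` or `--demo-trivial-merge` would simply
override this guess (and complain if the argument count does not match).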

>
>  DESCRIPTION
>  -----------
> -Reads three tree-ish, and output trivial merge results and
> -conflicting stages to the standard output.  This is similar to
> -what three-way 'git read-tree -m' does, but instead of storing the
> -results in the index, the command outputs the entries to the
> -standard output.
> +Performs a merge, but does not make any new commits and does not read
> +from or write to either the working tree or index.
>
> -This is meant to be used by higher level scripts to compute
> -merge results outside of the index, and stuff the results back into the
> -index.  For this reason, the output from the command omits
> -entries that match the <branch1> tree.
> +The first form will merge the two branches, doing a full recursive
> +merge with rename detection.  If the merge is clean, the exit status
> +will be `0`, and if the merge has conflicts, the exit status will be
> +`1`.  The output will consist solely of the resulting toplevel tree
> +(which may have files including conflict markers).
> +
> +The second form is meant for backward compatibility and will only do a
> +trival merge.  It reads three tree-ish, and outputs trivial merge
> +results and conflicting stages to the standard output in a semi-diff
> +format.  Since this was designed for higher level scripts to consume
> +and merge the results back into the index, it omits entries that match
> +<branch1>.  The result of this second form is is similar to what
> +three-way 'git read-tree -m' does, but instead of storing the results
> +in the index, the command outputs the entries to the standard output.
>
>  GIT
>  ---
> diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
> index e1d2832c809..ac50f3d108b 100644
> --- a/builtin/merge-tree.c
> +++ b/builtin/merge-tree.c
> @@ -2,6 +2,9 @@
>  #include "builtin.h"
>  #include "tree-walk.h"
>  #include "xdiff-interface.h"
> +#include "help.h"
> +#include "commit-reach.h"
> +#include "merge-ort.h"
>  #include "object-store.h"
>  #include "parse-options.h"
>  #include "repository.h"
> @@ -392,7 +395,57 @@ struct merge_tree_options {
>  static int real_merge(struct merge_tree_options *o,
>  		      const char *branch1, const char *branch2)
>  {
> -	die(_("real merges are not yet implemented"));
> +	struct commit *parent1, *parent2;
> +	struct commit_list *common;
> +	struct commit_list *merge_bases = NULL;
> +	struct commit_list *j;
> +	struct merge_options opt;
> +	struct merge_result result = { 0 };
> +
> +	parent1 = get_merge_parent(branch1);
> +	if (!parent1)
> +		help_unknown_ref(branch1, "merge",
> +				 _("not something we can merge"));
> +
> +	parent2 = get_merge_parent(branch2);
> +	if (!parent2)
> +		help_unknown_ref(branch2, "merge",
> +				 _("not something we can merge"));
> +
> +	init_merge_options(&opt, the_repository);
> +	/*
> +	 * TODO: Support subtree and other -X options?
> +	if (use_strategies_nr == 1 &&
> +	    !strcmp(use_strategies[0]->name, "subtree"))
> +		opt.subtree_shift = "";
> +	for (x = 0; x < xopts_nr; x++)
> +		if (parse_merge_opt(&opt, xopts[x]))
> +			die(_("Unknown strategy option: -X%s"), xopts[x]);
> +	*/
> +
> +	opt.show_rename_progress = 0;
> +
> +	opt.branch1 = merge_remote_util(parent1)->name; /* or just branch1? */
> +	opt.branch2 = merge_remote_util(parent2)->name; /* or just branch2? */
> +
> +	/*
> +	 * Get the merge bases, in reverse order; see comment above
> +	 * merge_incore_recursive in merge-ort.h
> +	 */
> +	common = get_merge_bases(parent1, parent2);
> +	for (j = common; j; j = j->next)
> +		commit_list_insert(j->item, &merge_bases);
> +
> +	/*
> +	 * TODO: notify if merging unrelated histories?

I guess it would make the most sense to add a flag that controls whether
this is allowed, and I would suggest that the default be `off`.

> +	if (!common)
> +		fprintf(stderr, _("merging unrelated histories"));
> +	 */
> +
> +	merge_incore_recursive(&opt, merge_bases, parent1, parent2, &result);
> +	printf("%s\n", oid_to_hex(&result.tree->object.oid));
> +	merge_switch_to_result(&opt, NULL, &result, 0, 0);

This looks to be equivalent to `merge_finalize(&opt, &result)`, so maybe
use that instead?

> +	return result.clean ? 0 : 1;
>  }
>
>  int cmd_merge_tree(int argc, const char **argv, const char *prefix)
> diff --git a/t/t4301-merge-tree-real.sh b/t/t4301-merge-tree-real.sh
> new file mode 100755
> index 00000000000..f7aa310f8c1
> --- /dev/null
> +++ b/t/t4301-merge-tree-real.sh
> @@ -0,0 +1,81 @@
> +#!/bin/sh
> +
> +test_description='git merge-tree --real'
> +
> +. ./test-lib.sh
> +
> +# This test is ort-specific
> +GIT_TEST_MERGE_ALGORITHM=ort
> +export GIT_TEST_MERGE_ALGORITHM

It might make sense to skip the entire test if the user asked for
`recursive` to be tested:

	test "${GIT_TEST_MERGE_ALGORITHM:-ort}" = ort || {
		skip_all="GIT_TEST_MERGE_ALGORITHM != ort"
		test_done
	}
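
In case the `${GIT_TEST_MERGE_ALGORITHM:-ort}` expansion looks
unfamiliar: it falls back to `ort` when the variable is unset or empty,
so the guard only skips when somebody explicitly asked for a different
backend. A tiny stand-alone sketch (the `wants_ort` function name is made
up for this example and is not part of `test-lib.sh`):

```shell
#!/bin/sh
# Made-up helper for illustration; not part of test-lib.sh.
# Succeeds when the ort backend is requested, explicitly or by default.
wants_ort () {
	test "${GIT_TEST_MERGE_ALGORITHM:-ort}" = ort
}

unset GIT_TEST_MERGE_ALGORITHM
wants_ort && echo "unset: run the tests"

GIT_TEST_MERGE_ALGORITHM=recursive
wants_ort || echo "recursive: skip the tests"
```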

> +
> +test_expect_success setup '
> +	test_write_lines 1 2 3 4 5 >numbers &&
> +	echo hello >greeting &&
> +	echo foo >whatever &&
> +	git add numbers greeting whatever &&
> +	git commit -m initial &&

I would really like to encourage the use of `test_tick`. It makes the
commit timestamps, and therefore the commit hashes, deterministic, just in
case you run into an issue that depends on some hash order.
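
To illustrate why pinned dates matter, here is a small stand-alone
sketch outside of the test suite. It does not use `test_tick` itself; it
merely sets the identity and date variables by hand, and the helper name
`commit_with_fixed_date` as well as the concrete identity and timestamp
are arbitrary choices for this example. It assumes `git` and `mktemp`
are available.

```shell
#!/bin/sh
# Illustration only: create one commit in a throw-away repository with a
# pinned identity and date, print its hash, and clean up again.
commit_with_fixed_date () {
	dir=$(mktemp -d) &&
	(
		cd "$dir" &&
		# Ignore the user's configuration so that it cannot affect the commit
		GIT_CONFIG_GLOBAL=/dev/null &&
		GIT_CONFIG_NOSYSTEM=1 &&
		GIT_AUTHOR_NAME="A U Thor" &&
		GIT_AUTHOR_EMAIL=author@example.com &&
		GIT_COMMITTER_NAME="C O Mitter" &&
		GIT_COMMITTER_EMAIL=committer@example.com &&
		GIT_AUTHOR_DATE="1112911993 -0700" &&
		GIT_COMMITTER_DATE="1112911993 -0700" &&
		export GIT_CONFIG_GLOBAL GIT_CONFIG_NOSYSTEM \
			GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_AUTHOR_DATE \
			GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL GIT_COMMITTER_DATE &&
		git init -q &&
		echo hello >greeting &&
		git add greeting &&
		git -c commit.gpgSign=false commit -q -m initial &&
		git rev-parse HEAD
	) &&
	rm -rf "$dir"
}

first=$(commit_with_fixed_date) &&
second=$(commit_with_fixed_date) &&
test "$first" = "$second" &&
echo "same hash in both repositories"
```

Without the pinned dates, the two hashes would differ whenever the two
commits happen to be created in different seconds.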

> +
> +	git branch side1 &&
> +	git branch side2 &&
> +
> +	git checkout side1 &&

Please use `git switch -c side1` or `git checkout -b side1`: it is more
compact than `git branch ... && git checkout ...`.

> +	test_write_lines 1 2 3 4 5 6 >numbers &&
> +	echo hi >greeting &&
> +	echo bar >whatever &&
> +	git add numbers greeting whatever &&
> +	git commit -m modify-stuff &&
> +
> +	git checkout side2 &&

This could be written as `git checkout -b side2 HEAD^`, to make the setup
more succinct.

> +	test_write_lines 0 1 2 3 4 5 >numbers &&
> +	echo yo >greeting &&
> +	git rm whatever &&
> +	mkdir whatever &&
> +	>whatever/empty &&
> +	git add numbers greeting whatever/empty &&
> +	git commit -m other-modifications
> +'
> +
> +test_expect_success 'Content merge and a few conflicts' '
> +	git checkout side1^0 &&
> +	test_must_fail git merge side2 &&
> +	cp .git/AUTO_MERGE EXPECT &&
> +	E_TREE=$(cat EXPECT) &&

The file `EXPECT` is not used below. And can we use a more obvious name?
Something like:

	expected_tree=$(cat .git/AUTO_MERGE)

> +	git reset --hard &&

For an extra bonus, we could delay this via `test_when_finished`, to prove
that `git merge-tree --real` works even in a dirty worktree _with
conflicts_.

> +	test_must_fail git merge-tree --real side1 side2 >RESULT &&
> +	R_TREE=$(cat RESULT) &&

How about `actual_tree` instead?

> +
> +	# Due to differences of e.g. "HEAD" vs "side1", the results will not
> +	# exactly match.  Dig into individual files.
> +
> +	# Numbers should have three-way merged cleanly
> +	test_write_lines 0 1 2 3 4 5 6 >expect &&
> +	git show ${R_TREE}:numbers >actual &&
> +	test_cmp expect actual &&
> +
> +	# whatever and whatever~<branch> should have same HASHES
> +	git rev-parse ${E_TREE}:whatever ${E_TREE}:whatever~HEAD >expect &&
> +	git rev-parse ${R_TREE}:whatever ${R_TREE}:whatever~side1 >actual &&
> +	test_cmp expect actual &&
> +
> +	# greeting should have a merge conflict
> +	git show ${E_TREE}:greeting >tmp &&
> +	cat tmp | sed -e s/HEAD/side1/ >expect &&
> +	git show ${R_TREE}:greeting >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'Barf on misspelled option' '
> +	# Mis-spell with single "s" instead of double "s"
> +	test_expect_code 129 git merge-tree --real --mesages FOOBAR side1 side2 2>expect &&
> +
> +	grep "error: unknown option.*mesages" expect
> +'

I do not think that this test case adds much, and we already test the
`parse_options()` machinery elsewhere.

> +
> +test_expect_success 'Barf on too many arguments' '
> +	test_expect_code 129 git merge-tree --real side1 side2 side3 2>expect &&
> +
> +	grep "^usage: git merge-tree" expect
> +'
> +
> +test_done

The rest looks awesome. Thank you for working on it! I will definitely
come back to review the rest (have to take a break now), and then probably
add quite a bit of food for thought based on my experience _actually_
using `merge-ort` on the server-side. Stay tuned.

Thank you,
Dscho



