On Fri, Jan 7, 2022 at 7:30 AM Johannes Schindelin <Johannes.Schindelin@xxxxxx> wrote: > > Hi Elijah, > > On Wed, 5 Jan 2022, Elijah Newren via GitGitGadget wrote: > > > From: Elijah Newren <newren@xxxxxxxxx> > > > > This adds the ability to perform real merges rather than just trivial > > merges (meaning handling three way content merges, recursive ancestor > > consolidation, renames, proper directory/file conflict handling, and so > > forth). However, unlike `git merge`, the working tree and index are > > left alone and no branch is updated. > > > > The only output is: > > - the toplevel resulting tree printed on stdout > > - exit status of 0 (clean) or 1 (conflicts present) > > > > This output is mean to be used by some higher level script, perhaps in a > ^^^^ > > My apologies for pointing out a grammar issue: This probably intended to > say "meant", as the word "mean" changes the sense of the sentence. Oops. Yeah, I'll correct that; thanks for pointing it out. > In my defense, I have more substantial suggestions below. > > > sequence of steps like this: > > > > NEWTREE=$(git merge-tree --real $BRANCH1 $BRANCH2) > > test $? -eq 0 || die "There were conflicts..." > > NEWCOMMIT=$(git commit-tree $NEWTREE -p $BRANCH1 -p $BRANCH2) > > git update-ref $BRANCH1 $NEWCOMMIT > > > > Note that higher level scripts may also want to access the > > conflict/warning messages normally output during a merge, or have quick > > access to a list of files with conflicts. That is not available in this > > preliminary implementation, but subsequent commits will add that > > ability. > > > > Signed-off-by: Elijah Newren <newren@xxxxxxxxx> > > --- > > Documentation/git-merge-tree.txt | 28 +++++++---- > > builtin/merge-tree.c | 55 +++++++++++++++++++++- > > t/t4301-merge-tree-real.sh | 81 ++++++++++++++++++++++++++++++++ > > 3 files changed, 153 insertions(+), 11 deletions(-) > > create mode 100755 t/t4301-merge-tree-real.sh > > > > diff --git a/Documentation/git-merge-tree.txt b/Documentation/git-merge-tree.txt > > index 58731c19422..5823938937f 100644 > > --- a/Documentation/git-merge-tree.txt > > +++ b/Documentation/git-merge-tree.txt > > @@ -3,26 +3,34 @@ git-merge-tree(1) > > > > NAME > > ---- > > -git-merge-tree - Show three-way merge without touching index > > +git-merge-tree - Perform merge without touching index or working tree > > > > > > SYNOPSIS > > -------- > > [verse] > > +'git merge-tree' --real <branch1> <branch2> > > 'git merge-tree' <base-tree> <branch1> <branch2> > > Here is an idea: How about aiming for this synopsis instead, exploiting > the fact that the "real" mode takes a different amount of arguments? My turn on the grammar thing: s/amount/number/. :-) > > 'git merge-tree' [--write-tree] <branch1> <branch2> > 'git merge-tree' [--demo-trivial-merge] <base-tree> <branch1> <branch2> > > That way, the old mode can still function, and can even at some stage be > deprecated and eventually removed. Ooh, interesting. > > > > DESCRIPTION > > ----------- > > -Reads three tree-ish, and output trivial merge results and > > -conflicting stages to the standard output. This is similar to > > -what three-way 'git read-tree -m' does, but instead of storing the > > -results in the index, the command outputs the entries to the > > -standard output. > > +Performs a merge, but does not make any new commits and does not read > > +from or write to either the working tree or index. > > > > -This is meant to be used by higher level scripts to compute > > -merge results outside of the index, and stuff the results back into the > > -index. For this reason, the output from the command omits > > -entries that match the <branch1> tree. > > +The first form will merge the two branches, doing a full recursive > > +merge with rename detection. If the merge is clean, the exit status > > +will be `0`, and if the merge has conflicts, the exit status will be > > +`1`. The output will consist solely of the resulting toplevel tree > > +(which may have files including conflict markers). > > + > > +The second form is meant for backward compatibility and will only do a > > +trival merge. It reads three tree-ish, and outputs trivial merge > > +results and conflicting stages to the standard output in a semi-diff > > +format. Since this was designed for higher level scripts to consume > > +and merge the results back into the index, it omits entries that match > > +<branch1>. The result of this second form is is similar to what > > +three-way 'git read-tree -m' does, but instead of storing the results > > +in the index, the command outputs the entries to the standard output. > > > > GIT > > --- > > diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c > > index e1d2832c809..ac50f3d108b 100644 > > --- a/builtin/merge-tree.c > > +++ b/builtin/merge-tree.c > > @@ -2,6 +2,9 @@ > > #include "builtin.h" > > #include "tree-walk.h" > > #include "xdiff-interface.h" > > +#include "help.h" > > +#include "commit-reach.h" > > +#include "merge-ort.h" > > #include "object-store.h" > > #include "parse-options.h" > > #include "repository.h" > > @@ -392,7 +395,57 @@ struct merge_tree_options { > > static int real_merge(struct merge_tree_options *o, > > const char *branch1, const char *branch2) > > { > > - die(_("real merges are not yet implemented")); > > + struct commit *parent1, *parent2; > > + struct commit_list *common; > > + struct commit_list *merge_bases = NULL; > > + struct commit_list *j; > > + struct merge_options opt; > > + struct merge_result result = { 0 }; > > + > > + parent1 = get_merge_parent(branch1); > > + if (!parent1) > > + help_unknown_ref(branch1, "merge", > > + _("not something we can merge")); > > + > > + parent2 = get_merge_parent(branch2); > > + if (!parent2) > > + help_unknown_ref(branch2, "merge", > > + _("not something we can merge")); > > + > > + init_merge_options(&opt, the_repository); > > + /* > > + * TODO: Support subtree and other -X options? > > + if (use_strategies_nr == 1 && > > + !strcmp(use_strategies[0]->name, "subtree")) > > + opt.subtree_shift = ""; > > + for (x = 0; x < xopts_nr; x++) > > + if (parse_merge_opt(&opt, xopts[x])) > > + die(_("Unknown strategy option: -X%s"), xopts[x]); > > + */ > > + > > + opt.show_rename_progress = 0; > > + > > + opt.branch1 = merge_remote_util(parent1)->name; /* or just branch1? */ > > + opt.branch2 = merge_remote_util(parent2)->name; /* or just branch2? */ > > + > > + /* > > + * Get the merge bases, in reverse order; see comment above > > + * merge_incore_recursive in merge-ort.h > > + */ > > + common = get_merge_bases(parent1, parent2); > > + for (j = common; j; j = j->next) > > + commit_list_insert(j->item, &merge_bases); > > + > > + /* > > + * TODO: notify if merging unrelated histories? > > I guess that it would make most sense to add a flag whether this is > allowed or not, and I would suggest the default to be `off`. Sounds fair. Thanks for commenting on one of the TODOs that I was unsure about. > > + if (!common) > > + fprintf(stderr, _("merging unrelated histories")); > > + */ > > + > > + merge_incore_recursive(&opt, merge_bases, parent1, parent2, &result); > > + printf("%s\n", oid_to_hex(&result.tree->object.oid)); > > + merge_switch_to_result(&opt, NULL, &result, 0, 0); > > This looks to be idempotent to `merge_finalize(&opt, &result)`, so maybe > use that instead? Yeah, and add a TODO about the display messages (that'll be addressed in a later patch, unlike the above TODOs). > > > + return result.clean ? 0 : 1; > > } > > > > int cmd_merge_tree(int argc, const char **argv, const char *prefix) > > diff --git a/t/t4301-merge-tree-real.sh b/t/t4301-merge-tree-real.sh > > new file mode 100755 > > index 00000000000..f7aa310f8c1 > > --- /dev/null > > +++ b/t/t4301-merge-tree-real.sh > > @@ -0,0 +1,81 @@ > > +#!/bin/sh > > + > > +test_description='git merge-tree --real' > > + > > +. ./test-lib.sh > > + > > +# This test is ort-specific > > +GIT_TEST_MERGE_ALGORITHM=ort > > +export GIT_TEST_MERGE_ALGORITHM > > It might make sense to skip the entire test if the user asked for > `recursive` to be tested: > > test "${GIT_TEST_MERGE_ALGORITHM:-ort}" = ort || > skip_all="GIT_TEST_MERGE_ALGORITHM != ort" > test_done > } The idea makes sense, but it took me a bit to understand this code block. I think you're just missing an opening left curly brace right after the '||'? > > + > > +test_expect_success setup ' > > + test_write_lines 1 2 3 4 5 >numbers && > > + echo hello >greeting && > > + echo foo >whatever && > > + git add numbers greeting whatever && > > + git commit -m initial && > > I would really like to encourage the use of `test_tick`. It makes the > commit consistent, just in case you run into an issue that depends on some > hash order. I've used test_tick before, but I already know this test can't depend on hash order. Further, the hashes in the output are also replaced before comparing in order to make the tests also work as-is under sha256. So the tests are explicitly ignoring precise hashes. As such, I'm not sure I see the value of test_tick here. > > + > > + git branch side1 && > > + git branch side2 && > > + > > + git checkout side1 && > > Please use `git switch -c side1` or `git checkout -b side1`: it is more > compact than `git branch ... && git checkout ...`. Yes, but less forgiving to later modification where I go and add additional commits on one of the sides, because... > > > + test_write_lines 1 2 3 4 5 6 >numbers && > > + echo hi >greeting && > > + echo bar >whatever && > > + git add numbers greeting whatever && > > + git commit -m modify-stuff && > > + > > + git checkout side2 && > > This could be written as `git checkout -b side2 HEAD^`, to make the setup > more succinct. ...the presumption of HEAD^ is hardcoded and has to be parsed by readers to understand the test. It felt like more cognitive overhead to me, in addition to being less malleable. > > + test_write_lines 0 1 2 3 4 5 >numbers && > > + echo yo >greeting && > > + git rm whatever && > > + mkdir whatever && > > + >whatever/empty && > > + git add numbers greeting whatever/empty && > > + git commit -m other-modifications > > +' > > + > > +test_expect_success 'Content merge and a few conflicts' ' > > + git checkout side1^0 && > > + test_must_fail git merge side2 && > > + cp .git/AUTO_MERGE EXPECT && > > + E_TREE=$(cat EXPECT) && > > The file `EXPECT` is not used below. And can we use a more obvious name? > SOmething like: > > expected_tree=$(cat .git/AUTO_MERGE) There go my beautiful <80 character lines below. :-( But on a more serious note, yeah this is probably better. I'll change it. :-) > > > + git reset --hard && > > For an extra bonus, we could delay this via `test_when_finished`, to prove > that `git merge-tree --real` works even in a dirty worktree _with > conflicts_. Ooh, good thought. I like that. > > > + test_must_fail git merge-tree --real side1 side2 >RESULT && > > + R_TREE=$(cat RESULT) && > > How about `actual_tree` instead? But my 80-characters rev-parse lines....waaah. Just kidding, yeah this would be better. > > + > > + # Due to differences of e.g. "HEAD" vs "side1", the results will not > > + # exactly match. Dig into individual files. > > + > > + # Numbers should have three-way merged cleanly > > + test_write_lines 0 1 2 3 4 5 6 >expect && > > + git show ${R_TREE}:numbers >actual && > > + test_cmp expect actual && > > + > > + # whatever and whatever~<branch> should have same HASHES > > + git rev-parse ${E_TREE}:whatever ${E_TREE}:whatever~HEAD >expect && > > + git rev-parse ${R_TREE}:whatever ${R_TREE}:whatever~side1 >actual && > > + test_cmp expect actual && > > + > > + # greeting should have a merge conflict > > + git show ${E_TREE}:greeting >tmp && > > + cat tmp | sed -e s/HEAD/side1/ >expect && > > + git show ${R_TREE}:greeting >actual && > > + test_cmp expect actual > > +' > > + > > +test_expect_success 'Barf on misspelled option' ' > > + # Mis-spell with single "s" instead of double "s" > > + test_expect_code 129 git merge-tree --real --mesages FOOBAR side1 side2 2>expect && > > + > > + grep "error: unknown option.*mesages" expect > > +' > > I do not think that this test case adds much, and we already test the > `parse_options()` machinery elsewhere. It's more about verifying that exit codes of 0 & 1 are reserved for "completed with no conflicts" and "completed with conflicts". The 129 bit in this test is the important bit (and perhaps is well-known to lots of other folks, but I thought it was worth highlighting). That said, I did a bad job mentioning that in the test description; I'll fix it up. > > + > > +test_expect_success 'Barf on too many arguments' ' > > + test_expect_code 129 git merge-tree --real side1 side2 side3 2>expect && > > + > > + grep "^usage: git merge-tree" expect > > +' > > + > > +test_done > > The rest looks awesome. Thank you for working on it! I will definitely > come back to review the rest (have to take a break now), and then probably > add quite a bit of food for thought based on my experience _actually_ > using `merge-ort` on the server-side. Stay tuned. Ooh, I'm intrigued. And thanks for reviewing!