On 11/24/2020 6:53 AM, Alban Gruin wrote: > This rewrites `git merge-octopus' from shell to C. As for the two last > conversions, this port removes calls to external processes to avoid > reading and writing the index over and over again. ... > diff --git a/builtin/merge-octopus.c b/builtin/merge-octopus.c > new file mode 100644 > index 0000000000..ca8f9f345d > --- /dev/null > +++ b/builtin/merge-octopus.c > @@ -0,0 +1,69 @@ > +/* > + * Builtin "git merge-octopus" > + * > + * Copyright (c) 2020 Alban Gruin > + * > + * Based on git-merge-octopus.sh, written by Junio C Hamano. > + * > + * Resolve two or more trees. > + */ > + > +#define USE_THE_INDEX_COMPATIBILITY_MACROS Hm. It would be best if this was not added to any new code. Please squash in changes that avoid these macros. > +#include "cache.h" > +#include "builtin.h" > +#include "commit.h" > +#include "merge-strategies.h" > + > +static const char builtin_merge_octopus_usage[] = > + "git merge-octopus [<bases>...] -- <head> <remote1> <remote2> [<remotes>...]"; > + > +int cmd_merge_octopus(int argc, const char **argv, const char *prefix) > +{ > + int i, sep_seen = 0; One strategy I've been trying to do, when I remember, is to start each builtin with struct repository *repo = the_repository; then use 'repo' over 'the_repository' and other macros. This gets us closer to the state where the builtin cmd_*() methods could take a repository pointer as a parameter. (We are a ways off, but every little bit helps, right?) > + /* > + * Reject if this is not an octopus -- resolve should be used > + * instead. > + */ > + if (commit_list_count(remotes) < 2) > + return 2; This caused me to pause, since it might be nice to have a warning message here. However, it is identical behavior to the script, including the comment. It appears that there is no Documentation/git-merge-octopus.txt, but such a doc file would want to include the meaning of exit code 2. > diff --git a/merge-strategies.c b/merge-strategies.c > index 9aa07e91b5..4d9dd55296 100644 > --- a/merge-strategies.c > +++ b/merge-strategies.c > @@ -1,5 +1,6 @@ > #include "cache.h" > #include "cache-tree.h" > +#include "commit-reach.h" You had my curiosity, but now you have my attention. ;) > +int merge_strategies_octopus(struct repository *r, > + struct commit_list *bases, const char *head_arg, > + struct commit_list *remotes) > +{ > + int ff_merge = 1, ret = 0, references = 1; > + struct commit **reference_commit, *head_commit; 'reference_commit' might be clearer if it was plural, right? > + struct tree *reference_tree, *head_tree; > + struct commit_list *i; > + struct object_id head; > + struct strbuf sb = STRBUF_INIT; > + > + get_oid(head_arg, &head); > + head_commit = lookup_commit_reference(r, &head); > + head_tree = repo_get_commit_tree(r, head_commit); > + > + if (parse_tree(head_tree)) > + return 2; > + > + if (repo_index_has_changes(r, head_tree, &sb)) { > + error(_("Your local changes to the following files " > + "would be overwritten by merge:\n %s"), > + sb.buf); > + strbuf_release(&sb); > + return 2; > + } > + > + reference_commit = xcalloc(commit_list_count(remotes) + 1, > + sizeof(struct commit *)); > + reference_commit[0] = head_commit; > + reference_tree = head_tree; > + > + for (i = remotes; i && i->item; i = i->next) { > + struct commit *c = i->item; > + struct object_id *oid = &c->object.oid; > + struct tree *current_tree = repo_get_commit_tree(r, c); > + struct commit_list *common, *j; > + char *branch_name; > + int k = 0, up_to_date = 0; > + > + if (ret) { > + /* > + * We allow only last one to have a > + * hand-resolvable conflicts. Last round failed > + * and we still had a head to merge. > + */ > + puts(_("Automated merge did not work.")); > + puts(_("Should not be doing an octopus.")); > + > + free(reference_commit); > + return 2; > + } > + > + branch_name = merge_get_better_branch_name(oid_to_hex(oid)); > + common = get_merge_bases_many(c, references, reference_commit); Here we are. You should probably use repo_get_merge_bases_many(). 'references' is not a list, but instead a count. Could it be renamed nr_references or something? > + > + if (!common) { > + error(_("Unable to find common commit with %s"), branch_name); > + > + free(branch_name); > + free_commit_list(common); > + free(reference_commit); > + > + return 2; hm. we are getting into magic constant territory. Perhaps this should be marked with a macro in merge-strategies.h? It could be used in the case of "only two heads" as well. > + } > + > + for (j = common; j && !(up_to_date || !ff_merge); j = j->next) { > + up_to_date |= oideq(&j->item->object.oid, oid); > + > + if (k < references) > + ff_merge &= oideq(&j->item->object.oid, &reference_commit[k++]->object.oid); I'm confused about this line. Shouldn't we care only about reference_commit[references]? If we _do_ care about all possible reference_commit[k] values, then shouldn't this be a loop over the k values, not a single check per k (and advancing as we iterate through the results from common)? Seems we could use some test cases around criss-cross octopus merges (i.e. multiple merge bases). > + } > + > + if (up_to_date) { > + printf(_("Already up to date with %s\n"), branch_name); > + > + free(branch_name); > + free_commit_list(common); > + continue; > + } > + > + if (ff_merge) { > + ret = octopus_fast_forward(r, branch_name, head_tree, > + current_tree, &reference_tree); > + references = 0; > + } else { > + ret = octopus_do_merge(r, branch_name, common, > + current_tree, &reference_tree); > + } > + > + free(branch_name); > + free_commit_list(common); > + > + if (ret == -1) > + break; > + > + reference_commit[references++] = c; > + } > + > + free(reference_commit); > + return ret; > +} This patch could use a little work, but it's a good start. Thanks, -Stolee