Re: [PATCH 3/3] merge-ort: implement merge_incore_recursive()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Dec 15, 2020 at 8:09 PM Elijah Newren <newren@xxxxxxxxx> wrote:
>
> On Tue, Dec 15, 2020 at 6:07 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
> >
> > "Elijah Newren via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
> >
> > > +/*
> > > + * Originally from merge_recursive_internal(); somewhat adapted, though.
> > > + */
> > > +static void merge_ort_internal(struct merge_options *opt,
> > > +                            struct commit_list *merge_bases,
> > > +                            struct commit *h1,
> > > +                            struct commit *h2,
> > > +                            struct merge_result *result)
> > > +{
> > > +     struct commit_list *iter;
> > > +     struct commit *merged_merge_bases;
> > > +     const char *ancestor_name;
> > > +     struct strbuf merge_base_abbrev = STRBUF_INIT;
> > > +
> > > +     if (!merge_bases) {
> > > +             merge_bases = get_merge_bases(h1, h2);
> > > +             merge_bases = reverse_commit_list(merge_bases);
> >
> > Do we want to say why we reverse here, or is the reason so well
> > known why in the original merge-recursive case?
>
> Oh, good point.  After starting on merge-ort, I shifted back and forth
> between it and cleaning up merge-recursive for a while...and it looks
> like this is one of the things I forgot to copy over.  The reason was
> totally opaque to me until Johannes spelled it out over here:
> https://lore.kernel.org/git/nycvar.QRO.7.76.6.1907252055500.21907@xxxxxxxxxxxxxxxxx/
>
> Note that the same reversing of merge bases is done in builtin/merge.c
> and sequencer.c as well.  It resulted in me adding the following note
> to the declaration of merge_recursive() in merge-recursive.h:
>
>  * NOTE: empirically, about a decade ago it was determined that with more
>  *       than two merge bases, optimal behavior was found when the
>  *       merge_bases were passed in the order of oldest commit to newest
>  *       commit.  Also, merge_bases will be consumed (emptied) so make a
>  *       copy if you need it.
>
> but I never copied that comment over to merge_incore_recursive().  I
> should do that, and perhaps reference that comment at this point in
> the code.
>
> > > +     }
> > > +
> > > +     merged_merge_bases = pop_commit(&merge_bases);
> > > +     if (merged_merge_bases == NULL) {
> > > +             /* if there is no common ancestor, use an empty tree */
> > > +             struct tree *tree;
> > > +
> > > +             tree = lookup_tree(opt->repo, opt->repo->hash_algo->empty_tree);
> > > +             merged_merge_bases = make_virtual_commit(opt->repo, tree,
> > > +                                                      "ancestor");
> > > +             ancestor_name = "empty tree";
> > > +     } else if (opt->ancestor && !opt->priv->call_depth) {
> > > +             ancestor_name = opt->ancestor;
> > > +     } else if (merge_bases) {
> > > +             ancestor_name = "merged common ancestors";
> > > +     } else {
> > > +             strbuf_add_unique_abbrev(&merge_base_abbrev,
> > > +                                      &merged_merge_bases->object.oid,
> > > +                                      DEFAULT_ABBREV);
> > > +             ancestor_name = merge_base_abbrev.buf;
> > > +     }
> >
> > So, up to this point we learned:
> >
> >  - merge bases either given by the caller, or computed from h1 and
> >    h2 when the caller just told us to figure it out ourselves.
> >
> >  - if we have
> >
> >    - 0 merge base between h1 and h2, in which case we would use an
> >      empty tree as an imaginary common
> >
> >    - 1 merge base between h1 and h2, in which case the common
> >      ancestor of the resuting merge between h1 and h2 is that single
> >      merge base
> >
> >    - 2 or more bases, in which case we'd use that would eventually
> >      come back when we merged recursively all bases.
> >
> > and the primary products of the above procedure are
> >
> >  - ancestor_name (the string used when presenting conflicts while
> >    merging h1 and h2)
> >
> >  - merged_merge_bases (one starting commit among the merge bases)
> >
> > And then the loop will iterate over the remaining merge bases,
> > merging one popped from it in the current merged_merge_bases,
> > until we run out.  At that point when we leave the loop, we'd
> > have merged_merge_bases that is a virtual commit to be used as
> > a single merge base to use while merging trees of h1 and h2.
> >
> > > +     for (iter = merge_bases; iter; iter = iter->next) {
> > > +             const char *saved_b1, *saved_b2;
> > > +             struct commit *prev = merged_merge_bases;
> > > +
> > > +             opt->priv->call_depth++;
> > > +             /*
> > > +              * When the merge fails, the result contains files
> > > +              * with conflict markers. The cleanness flag is
> > > +              * ignored (unless indicating an error), it was never
> > > +              * actually used, as result of merge_trees has always
> > > +              * overwritten it: the committed "conflicts" were
> > > +              * already resolved.
> > > +              */
> > > +             saved_b1 = opt->branch1;
> > > +             saved_b2 = opt->branch2;
> > > +             opt->branch1 = "Temporary merge branch 1";
> > > +             opt->branch2 = "Temporary merge branch 2";
> > > +             merge_ort_internal(opt, NULL, prev, iter->item, result);
> > > +             if (result->clean < 0)
> > > +                     return;
> > > +             opt->branch1 = saved_b1;
> > > +             opt->branch2 = saved_b2;
> > > +             opt->priv->call_depth--;
> > > +
> > > +             merged_merge_bases = make_virtual_commit(opt->repo,
> > > +                                                      result->tree,
> > > +                                                      "merged tree");
> > > +             commit_list_insert(prev, &merged_merge_bases->parents);
> > > +             commit_list_insert(iter->item,
> > > +                                &merged_merge_bases->parents->next);
> >
> > We need to record these parents because...?  When merged_merge_bases
> > we just created is used as one side of a merge in the next iteration,
> > we'd need to compute the merge base between it and the one we'd pop
> > out of merge_bases, and that is why.
> >
> > > +             clear_or_reinit_internal_opts(opt->priv, 1);
> > > +     }
> >
> > OK.  I think I understood this loop.  It looks mostly straight-forward.
> >
> > > +     opt->ancestor = ancestor_name;
> >
> > And the label to be used, that was computed before the above loop,
> > is used here...
> >
> > > +     merge_ort_nonrecursive_internal(opt,
> > > +                                     repo_get_commit_tree(opt->repo,
> > > +                                                          merged_merge_bases),
> > > +                                     repo_get_commit_tree(opt->repo, h1),
> > > +                                     repo_get_commit_tree(opt->repo, h2),
> > > +                                     result);
> >
> > ... to finally compute the 3-way merge between h1 and h2.
> >
> > > +     strbuf_release(&merge_base_abbrev);
> >
> > And the storage that may have been holding the .ancestor name is
> > cleared, as we no longer need it.
> >
> > > +     opt->ancestor = NULL;  /* avoid accidental re-use of opt->ancestor */
> > > +}
> > > +
> > >  void merge_incore_nonrecursive(struct merge_options *opt,
> > >                              struct tree *merge_base,
> > >                              struct tree *side1,
> > > @@ -1493,7 +1577,9 @@ void merge_incore_recursive(struct merge_options *opt,
> > >                           struct commit *side2,
> > >                           struct merge_result *result)
> > >  {
> > > -     (void)reverse_commit_list;
> > > -     (void)make_virtual_commit;
> > > -     die("Not yet implemented");
> > > +     assert(opt->ancestor == NULL ||
> > > +            !strcmp(opt->ancestor, "constructed merge base"));
> >
> > Where does this string come from?  The recursive backend does
> > asssign a fixed string with that value to opt->ancestor, but we
> > don't expect that string to come here, no?
>
> It's specifically the merge_recursive_generic() function from
> merge-recursive.c that sets this, which was part of the
> merge-recursive API.  merge-ort does not (yet?) have an equivalent
> function (anything calling merge_recursive_generic() just can't use
> merge-ort right now -- a list that includes 'am', 'stash', and
> 'merge-recursive').  For now, I am just letting those callers continue
> to use merge-recursive.c.  I never figured out if I wanted to make
> that function part of merge-ort's API, whether I just wanted to add a
> wrapper to merge-ort-wrappers.[ch] for it, or if I should rewrite the
> callers to do something else.
>
> However, looking a little closer, the name for opt->ancestor is
> slightly phony -- I think it only makes sense for "am", not for either
> of "stash" or "merge-recursive".  Perhaps I should instead count the
> number of merge_bases, and assert that either opt->ancestor == NULL
> (exclusive)OR  num_merge_bases == 1.  merge_recursive_generic() should

No, this isn't right.  Most callers just grab all the merge bases and
often there happens to be only one.  In such a case, the caller
shouldn't have to set opt->ancestor; the fallback of referencing the
commit id is perfectly good there.  It's just that when there is
exactly one merge base, a special caller like
merge_recursive_generic() might want to override the value for
opt->ancestor.

Anyway, I'll post a new patch soon.

> also be made to stop setting opt->ancestor, and then callers like
> "am", "stash", and "merge-recursive" should be responsible to provide
> a reasonable ancestor name for merge.conflictStyle=diff3 to use when
> it's clear they are providing the sole ancestor.
>
> Then at some point I can decide what to do with
> merge_recursive_generic().  I'll probably just make it a wrapper at
> some point; since that lets me kick the can down the road even
> further.  :-)
>
> > > +     merge_start(opt, result);
> > > +     merge_ort_internal(opt, merge_bases, side1, side2, result);
> > >  }



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux