Re: [PATCH v3 1/3] diff-merges: improve --diff-merges documentation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On Thu, Oct 5, 2023 at 2:24 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> Sergey Organov <sorganov@xxxxxxxxx> writes:
>
> > ---diff-merges=(off|none|on|first-parent|1|separate|m|combined|c|dense-combined|cc|remerge|r)::
> > +-m::
> > +     Show diffs for merge commits in the default format. This is
> > +     similar to '--diff-merges=on' (which see) except `-m` will
> > +     produce no output unless `-p` is given as well.
>
> I think the sentence reads better without the translated (q.v.) that
> confused Eric.

Agreed; confused me too.

> > +-c::
> > +     Produce combined diff output for merge commits.
> > +     Shortcut for '--diff-merges=combined -p'.
> > +
> > +--cc::
> > +     Produce dense combined diff output for merge commits.
> > +     Shortcut for '--diff-merges=dense-combined -p'.
>
> Good.
>
> > +--remerge-diff::
> > +     Produce diff against re-merge.
> > +     Shortcut for '--diff-merges=remerge -p'.
>
> I suspect that many people do not get what "re-merge" in "against
> re-merge" really is.  As "combined diff" and "dense combined diff"
> are not explained in the previous two entries either, and expect the
> readers to read the real description (which more or less matches
> what the original description for "-c" and "--cc" had, which is
> good), it would be better to say "Produce remerge-diff output for
> merge commits."  here, too.  It makes it consistent, and "for merge
> commits" makes it clear the "magic" does not apply to regular
> commits (which the above entries for "-c" and "--cc" do, which is
> very good).

Perhaps:

Produce remerge-diff output for merge commits, in order to show how
conflicts were resolved.

> > +separate::
> > +     Show full diff with respect to each of parents.
> > +     Separate log entry and diff is generated for each parent.
>
> In the early days of Git before -c/--cc were invented, we explained
> this mode as "pairwise comparison", and the phrase "pairwise" still
> may be the best one to describe the behaviour here.  In fact, we see
> in the updated description of combined below the exact phrase is used
> to refer to this oldest output format.
>
>     Show the `--patch` output pairwise, together with the commit
>     header, repeated for each parent for a merge commit.

I like this.

> or something, perhaps.  I added "repeated" here to make the contrast
> with "simultaneously" stand out.
>
> > +combined, c::
> > +     Show differences from each of the parents to the merge
> > +     result simultaneously instead of showing pairwise diff between
> > +     a parent and the result one at a time. Furthermore, it lists
> > +     only files which were modified from all parents.
> > ++
> > +dense-combined, cc::
> > +     Further compress output produced by `--diff-merges=combined`
> > +     by omitting uninteresting hunks whose contents in the parents
> > +     have only two variants and the merge result picks one of them
> > +     without modification.
> > ++
> > +remerge, r::
> > +     Remerge two-parent merge commits to create a temporary tree
> > +     object--potentially containing files with conflict markers
> > +     and such.  A diff is then shown between that temporary tree
> > +     and the actual merge commit.
>
> The original says "two-parent merge comimts are remerged" so it is
> not a failure of this patch, but the first verb "Remerge" sounds
> unnecessarily unfriendly to the readers.
>
>         For a two-parent merge commit, a merge of these two commits
>         is retried to create a temporary tree object, potentially
>         containing files with conflict markers.  A `--patch` output
>         then is shown between ...
>
> would be easier to follow and more faithful to the original
> description added by db757e8b (show, log: provide a --remerge-diff
> capability, 2022-02-02).

I like it.  Perhaps it may also benefit from explaining why this mode
is useful as well:

    For a two-parent merge commit, a merge of these two commits is
    retried to create a temporary tree object, potentially containing
    files with conflict markers.  A diff is then shown between that
    temporary tree and the actual merge commit.  This has the effect
    of showing whether and how both semantic and textual conflicts
    were resolved by the user (i.e. what changes the user made after
    running 'git merge' and before finally committing).

> Either way, it makes readers wonder what happens to merges with more
> than 2 parents (octopus merges).  It is not a new problem and this
> topic should not attempt to fix it.

We could add:

    For octopus merges (merges with more than two parents), currently
only shows a warning about skipping such commits.

if wanted.

But perhaps I've distracted too much from Sergey's topic, and I should
submit these wording tweaks as a patch on top?  I'm fine either way.

> [Footnote]
>
> * When a project allows fast-forward merges, something like this can
>   happen (and Git was _designed_ to allow and even encourage it)
>
>   - Linus pulls from Sergey and sees merge conflicts that are very
>     messy.  Sergey is asked to resolve the conflict, as Linus knows
>     Sergey understands the changes he is asking Linus to pull much
>     better than Linus does.
>
>   - Sergey does "git pull origin" that would give the same set of
>     conflicts Linus saw, perhaps ours/theirs sides swapped, resolves
>     the conflicts, and comits the merge result.  He may even add a
>     few other improvements on top (or may not).  He tells Linus that
>     his tree is ready to be pulled again.
>
>   - Linus pulls from Sergey again.  This time it is fast-forward,
>     without an extra merge commit that records the Linus's previous
>     tip as the first parent and Sergey's work as the second parent.
>
>   - Linus continues working from here.
>
>   In such a workflow, merges are nothing more than "combining
>   multiple histories together" and the first parenthood is NOT
>   inherently special among parents at all.  The original "-m -p"
>   (aka "pairwise diff") output reflects this world view and ensures
>   that all parents are shown more or less as equals (yes, the first
>   parent diff is shown first before the other parents, but you
>   cannot avoid it when outputting to a single dimension medium).
>
>   This world view was the only world view Git supported, until I
>   added the "--first-parent" traversal in 0053e902 (git-log
>   --first-parent: show only the first parent log, 2007-03-13).
>
>   With the "--first-parent", with "--no-ff" option to "git merge", a
>   different world view becomes possible.  A merge is not merely
>   combining multiple histories, which are equals.  It is bringing
>   work done on a side branch into the trunk.  To see the overview of
>   the history, "git log --first-parent" would give the outline,
>   which would be full of merges from side branches, each of which
>   can be seen as summarizing the work done on the side branch that
>   was merged, and it may occasionally have single-parent commits
>   that are hotfixes or trivial clean-ups or project administrivia
>   commits.  With "-p", "git log" would show the changes the work
>   done on a side branch as a single unit for a merge, and individual
>   commits if they are single-parent.  The life is good.
>
>   It all breaks down if the "diff against the first parent" is done
>   on a merge that is not bringing the work on a side branch in to
>   the trunk.  The merge done in the second step Sergey did for Linus
>   in the above example will have his work on the history leading to
>   its first parent, and from the overall project's point of view,
>   the second parent is the tip of the history of the trunk.  Showing
>   first-parent diff for a merge that was *not* discovered via the
>   first-parent traversal would show such a meaningless patch.  This
>   is an illustration of the fallout from mixing two incompatible
>   world views together, "--diff-merges=first-parent" wants to work
>   in a world where the first-parent is special among parents, but
>   traversal without "--first-parent" wants to treat all the branches
>   equally.
>
>   All the other <format>s accepted by the "--diff-merges=<format>"
>   option are symmetrical and they work equally well when in a
>   history of a project that considers the first-parenthood special
>   (i.e. work on a side branch is brought into the trunk history) or
>   in a history with merges whose parent order should not matter, so
>   unlike "--diff-merges=first-parent", it makes sense to apply them
>   with or without first-parent traversal.  It however is not true
>   for the "--diff-merges=first-parent" variant, which is asymmetric.
>
>   And that is why I think use of "--diff-merges=first-parent"
>   without "--first-parent" traversal is a bad thing to teach users
>   to use.

Thanks for writing this up.  In the past, I didn't know how to put
into words why I didn't particularly care for this mode.  You explain
it rather well.




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux