Re: [PATCH v3 1/3] diff-merges: improve --diff-merges documentation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Elijah Newren <newren@xxxxxxxxx> writes:

> Hi,
>
> On Thu, Oct 5, 2023 at 2:24 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
>>
>> Sergey Organov <sorganov@xxxxxxxxx> writes:
>> > +--remerge-diff::
>> > +     Produce diff against re-merge.
>> > +     Shortcut for '--diff-merges=remerge -p'.
>>
>> I suspect that many people do not get what "re-merge" in "against
>> re-merge" really is.  As "combined diff" and "dense combined diff"
>> are not explained in the previous two entries either, and expect the
>> readers to read the real description (which more or less matches
>> what the original description for "-c" and "--cc" had, which is
>> good), it would be better to say "Produce remerge-diff output for
>> merge commits."  here, too.  It makes it consistent, and "for merge
>> commits" makes it clear the "magic" does not apply to regular
>> commits (which the above entries for "-c" and "--cc" do, which is
>> very good).
>
> Perhaps:
>
> Produce remerge-diff output for merge commits, in order to show how
> conflicts were resolved.

Will use this description in re-roll.

>
>> > +separate::
>> > +     Show full diff with respect to each of parents.
>> > +     Separate log entry and diff is generated for each parent.
>>
>> In the early days of Git before -c/--cc were invented, we explained
>> this mode as "pairwise comparison", and the phrase "pairwise" still
>> may be the best one to describe the behaviour here.  In fact, we see
>> in the updated description of combined below the exact phrase is used
>> to refer to this oldest output format.
>>
>>     Show the `--patch` output pairwise, together with the commit
>>     header, repeated for each parent for a merge commit.
>
> I like this.
>
>> or something, perhaps.  I added "repeated" here to make the contrast
>> with "simultaneously" stand out.

Please let's left it for some follow-up, as this patch does not rephrase
original, just changes the presentation.

>>
>> > +combined, c::
>> > +     Show differences from each of the parents to the merge
>> > +     result simultaneously instead of showing pairwise diff between
>> > +     a parent and the result one at a time. Furthermore, it lists
>> > +     only files which were modified from all parents.
>> > ++
>> > +dense-combined, cc::
>> > +     Further compress output produced by `--diff-merges=combined`
>> > +     by omitting uninteresting hunks whose contents in the parents
>> > +     have only two variants and the merge result picks one of them
>> > +     without modification.
>> > ++
>> > +remerge, r::
>> > +     Remerge two-parent merge commits to create a temporary tree
>> > +     object--potentially containing files with conflict markers
>> > +     and such.  A diff is then shown between that temporary tree
>> > +     and the actual merge commit.
>>
>> The original says "two-parent merge comimts are remerged" so it is
>> not a failure of this patch, but the first verb "Remerge" sounds
>> unnecessarily unfriendly to the readers.
>>
>>         For a two-parent merge commit, a merge of these two commits
>>         is retried to create a temporary tree object, potentially
>>         containing files with conflict markers.  A `--patch` output
>>         then is shown between ...
>>
>> would be easier to follow and more faithful to the original
>> description added by db757e8b (show, log: provide a --remerge-diff
>> capability, 2022-02-02).
>
> I like it.  Perhaps it may also benefit from explaining why this mode
> is useful as well:
>
>     For a two-parent merge commit, a merge of these two commits is
>     retried to create a temporary tree object, potentially containing
>     files with conflict markers.  A diff is then shown between that
>     temporary tree and the actual merge commit.  This has the effect
>     of showing whether and how both semantic and textual conflicts
>     were resolved by the user (i.e. what changes the user made after
>     running 'git merge' and before finally committing).
>
>> Either way, it makes readers wonder what happens to merges with more
>> than 2 parents (octopus merges).  It is not a new problem and this
>> topic should not attempt to fix it.
>
> We could add:
>
>     For octopus merges (merges with more than two parents), currently
> only shows a warning about skipping such commits.
>
> if wanted.
>
> But perhaps I've distracted too much from Sergey's topic, and I should
> submit these wording tweaks as a patch on top?  I'm fine either way.
>
>> [Footnote]
>>
>> * When a project allows fast-forward merges, something like this can
>>   happen (and Git was _designed_ to allow and even encourage it)
>>
>>   - Linus pulls from Sergey and sees merge conflicts that are very
>>     messy.  Sergey is asked to resolve the conflict, as Linus knows
>>     Sergey understands the changes he is asking Linus to pull much
>>     better than Linus does.
>>
>>   - Sergey does "git pull origin" that would give the same set of
>>     conflicts Linus saw, perhaps ours/theirs sides swapped, resolves
>>     the conflicts, and comits the merge result.  He may even add a
>>     few other improvements on top (or may not).  He tells Linus that
>>     his tree is ready to be pulled again.
>>
>>   - Linus pulls from Sergey again.  This time it is fast-forward,
>>     without an extra merge commit that records the Linus's previous
>>     tip as the first parent and Sergey's work as the second parent.
>>
>>   - Linus continues working from here.
>>
>>   In such a workflow, merges are nothing more than "combining
>>   multiple histories together" and the first parenthood is NOT
>>   inherently special among parents at all.  The original "-m -p"
>>   (aka "pairwise diff") output reflects this world view and ensures
>>   that all parents are shown more or less as equals (yes, the first
>>   parent diff is shown first before the other parents, but you
>>   cannot avoid it when outputting to a single dimension medium).
>>
>>   This world view was the only world view Git supported, until I
>>   added the "--first-parent" traversal in 0053e902 (git-log
>>   --first-parent: show only the first parent log, 2007-03-13).
>>
>>   With the "--first-parent", with "--no-ff" option to "git merge", a
>>   different world view becomes possible.  A merge is not merely
>>   combining multiple histories, which are equals.  It is bringing
>>   work done on a side branch into the trunk.  To see the overview of
>>   the history, "git log --first-parent" would give the outline,
>>   which would be full of merges from side branches, each of which
>>   can be seen as summarizing the work done on the side branch that
>>   was merged, and it may occasionally have single-parent commits
>>   that are hotfixes or trivial clean-ups or project administrivia
>>   commits.  With "-p", "git log" would show the changes the work
>>   done on a side branch as a single unit for a merge, and individual
>>   commits if they are single-parent.  The life is good.
>>
>>   It all breaks down if the "diff against the first parent" is done
>>   on a merge that is not bringing the work on a side branch in to
>>   the trunk.  The merge done in the second step Sergey did for Linus
>>   in the above example will have his work on the history leading to
>>   its first parent, and from the overall project's point of view,
>>   the second parent is the tip of the history of the trunk.  Showing
>>   first-parent diff for a merge that was *not* discovered via the
>>   first-parent traversal would show such a meaningless patch.  This
>>   is an illustration of the fallout from mixing two incompatible
>>   world views together, "--diff-merges=first-parent" wants to work
>>   in a world where the first-parent is special among parents, but
>>   traversal without "--first-parent" wants to treat all the branches
>>   equally.
>>
>>   All the other <format>s accepted by the "--diff-merges=<format>"
>>   option are symmetrical and they work equally well when in a
>>   history of a project that considers the first-parenthood special
>>   (i.e. work on a side branch is brought into the trunk history) or
>>   in a history with merges whose parent order should not matter, so
>>   unlike "--diff-merges=first-parent", it makes sense to apply them
>>   with or without first-parent traversal.  It however is not true
>>   for the "--diff-merges=first-parent" variant, which is asymmetric.
>>
>>   And that is why I think use of "--diff-merges=first-parent"
>>   without "--first-parent" traversal is a bad thing to teach users
>>   to use.
>
> Thanks for writing this up.  In the past, I didn't know how to put
> into words why I didn't particularly care for this mode.  You explain
> it rather well.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux