Re: [PATCH] range-diff: optionally include merge commits' diffs in the analysis

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Hannes

On Fri, 8 Nov 2024, Johannes Sixt wrote:

> Am 07.11.24 um 18:20 schrieb Johannes Schindelin via GitGitGadget:
> > +--diff-merges=<format>::
> > +	Instead of ignoring merge commits, generate diffs for them using the
> > +	corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
> > +	and include them in the comparison.
> > ++
> > +Note: Some of the formats supported by linkgit:git-log[1] make less sense in
> > +the context of the `range-diff` command than other formats, so choose wisely!
>
> Can we please be a bit more specific which options are usable or which
> are not usable?

There are a couple of reasons:

- Most importantly: I had not even studied that question in depth when I
  wrote the original patch, and not even when I submitted the first
  iteration.

  In my reviews of branch thickets, I used `--diff-merges=1` and it served
  my needs well. Therefore I can speak to this mode, but not really to the
  other modes.

- Which options are usable or not may lack a universally-correct answer.

  One person might find `--diff-merges=separate` confusing while the next
  person may find it quite useful.

- If the documentation points out an unsuitable mode, then an obvious
  follow-up wish would be to reject this mode in the command, with an
  error message. Some users might, however, need precisely this mode, so
  I am loathe to do that.

Maybe this compromise: Change the "Note:" to recommend the `first-parent`
mode, with a bit more explanation why that is so (it is consistent with
the idea that a merge is kind of a "meta patch", comprising all the merged
commits' patches, which is equivalent to the first-parent diff).

> Perhaps you even mean to say that 'first-parent' is the only one that
> makes sense?

I would not go _so_ far as to say that.

Before submitting the patch yesterday, I played a little with a few
modes, using the commit graph as per the new t3206 test case.

What that test case range-diffs looks like this:

 - * - changeA - changeB ------------
     \                     \          \
      \                      conflict  \
       \                   /            \
         rename - changeA - changeB ----- clean

In other words, the main branch modifies a file's contents twice, the
topic branch renames it first, then makes the same modifications.

The clean merge simply merges the topic, which makes the main branch
tree-same to the topic branch.

The conflicting merge misses the tip commit of the topic branch, i.e.
changeB on the renamed file. The result is tree-same to the clean merge
because changeB on the non-renamed file is already part of the main
branch.

Out of habit, I used `--diff-merge=first-parent` in the new test case,
whose output reiterates what I just said: both merges introduce the same
first-parent diff (renaming the file, no other changes):

	1:  1e6226b < -:  ------- s/12/B/
	2:  043e738 = 1:  6ddec15 merge

What about `separate`?


	2:  043e738 = 1:  6ddec15 merge
	1:  1e6226b ! 2:  6ddec15 s/12/B/
	    @@
	      ## Metadata ##
	    -Author: Thomas Rast <trast@xxxxxxxxxxx>
	    +Author: A U Thor <author@xxxxxxxxxxx>

	      ## Commit message ##
	    -    s/12/B/
	    +    merge

	      ## renamed-file ##
	     @@ renamed-file: A

This might look confusing at first because there are two merge diffs on
the right-hand side, but only one on the left-hand side. But that is all
good because the diff of the clean merge between merge head and merge is
empty and therefore not shown. And the `range-diff` demonstrates that the
conflicting merge had to recapitulate the tip commit of the `renamed-file`
topic branch.

The `combined` and `dense-combined` modes produce identical output to
`first-parent` in this instance, which is expected.

Now, let's use `remerge`, which should be illuminating with regards to
"evil merges" [*1*], as it shows what was done on top of the automatic
merge:

	1:  1e6226b < -:  ------- s/12/B/
	2:  043e738 < -:  ------- merge
	-:  ------- > 1:  6ddec15 merge

Huh. That is a bit surprising at first. Let's use a higher creation factor
to ask `range-diff` to match correspondences more loosely (I used 999,
which kinda forces even non-sensical pairings):

	1:  1e6226b ! 1:  6ddec15 s/12/B/
	    @@
	      ## Metadata ##
	    -Author: Thomas Rast <trast@xxxxxxxxxxx>
	    +Author: A U Thor <author@xxxxxxxxxxx>

	      ## Commit message ##
	    -    s/12/B/
	    +    merge

	      ## renamed-file ##
	    + remerge CONFLICT (content): Merge conflict in renamed-file
	    + index 3f1c0da..25ddf7a 100644
	    + --- renamed-file
	    + +++ renamed-file
	     @@ renamed-file: A
	      9
	      10
	      B
	    +-<<<<<<< 2901f77 (s/12/B/):file
	    + B
	    +-=======
	     -12
	    -+B
	    +->>>>>>> 3ce7af6 (s/11/B/):renamed-file
	      13
	      14
	      15
	2:  043e738 < -:  ------- merge

(Note that if you repeat this experiment, you will see something different
due to a bug in `--remerge-diff` that I will fix separately.)

Hold on, that looks wrong! It is actually not wrong, `range-diff` just
finds the changeB a better pairing than the merge commit with an empty
diff...

So: This mode can be useful to detect where merge conflicts had to be
resolved.

In short, I think that in `range-diff`'s context all of these modes have
merit, depending on a particular use case. It's just that `first-parent`
is probably the most natural to use in the common case.

> At a minimum, we should mention that it is the one that makes most sense
> (if that is the case).

Yeah, I'll go with that and drop saying anything about other modes.

Ciao,
Johannes

Footnote *1*: We should really find a much better name for this than "evil
merge". There is nothing evil about me having to add `#include
"environment.h"` in v2.45.1^, for example. It was necessary so that the
code would build. Tedious, yes, but not evil.





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux