Hi Hannes On Fri, 8 Nov 2024, Johannes Sixt wrote: > Am 07.11.24 um 18:20 schrieb Johannes Schindelin via GitGitGadget: > > +--diff-merges=<format>:: > > + Instead of ignoring merge commits, generate diffs for them using the > > + corresponding `--diff-merges=<format>` option of linkgit:git-log[1], > > + and include them in the comparison. > > ++ > > +Note: Some of the formats supported by linkgit:git-log[1] make less sense in > > +the context of the `range-diff` command than other formats, so choose wisely! > > Can we please be a bit more specific which options are usable or which > are not usable? There are a couple of reasons: - Most importantly: I had not even studied that question in depth when I wrote the original patch, and not even when I submitted the first iteration. In my reviews of branch thickets, I used `--diff-merges=1` and it served my needs well. Therefore I can speak to this mode, but not really to the other modes. - Which options are usable or not may lack a universally-correct answer. One person might find `--diff-merges=separate` confusing while the next person may find it quite useful. - If the documentation points out an unsuitable mode, then an obvious follow-up wish would be to reject this mode in the command, with an error message. Some users might, however, need precisely this mode, so I am loathe to do that. Maybe this compromise: Change the "Note:" to recommend the `first-parent` mode, with a bit more explanation why that is so (it is consistent with the idea that a merge is kind of a "meta patch", comprising all the merged commits' patches, which is equivalent to the first-parent diff). > Perhaps you even mean to say that 'first-parent' is the only one that > makes sense? I would not go _so_ far as to say that. Before submitting the patch yesterday, I played a little with a few modes, using the commit graph as per the new t3206 test case. What that test case range-diffs looks like this: - * - changeA - changeB ------------ \ \ \ \ conflict \ \ / \ rename - changeA - changeB ----- clean In other words, the main branch modifies a file's contents twice, the topic branch renames it first, then makes the same modifications. The clean merge simply merges the topic, which makes the main branch tree-same to the topic branch. The conflicting merge misses the tip commit of the topic branch, i.e. changeB on the renamed file. The result is tree-same to the clean merge because changeB on the non-renamed file is already part of the main branch. Out of habit, I used `--diff-merge=first-parent` in the new test case, whose output reiterates what I just said: both merges introduce the same first-parent diff (renaming the file, no other changes): 1: 1e6226b < -: ------- s/12/B/ 2: 043e738 = 1: 6ddec15 merge What about `separate`? 2: 043e738 = 1: 6ddec15 merge 1: 1e6226b ! 2: 6ddec15 s/12/B/ @@ ## Metadata ## -Author: Thomas Rast <trast@xxxxxxxxxxx> +Author: A U Thor <author@xxxxxxxxxxx> ## Commit message ## - s/12/B/ + merge ## renamed-file ## @@ renamed-file: A This might look confusing at first because there are two merge diffs on the right-hand side, but only one on the left-hand side. But that is all good because the diff of the clean merge between merge head and merge is empty and therefore not shown. And the `range-diff` demonstrates that the conflicting merge had to recapitulate the tip commit of the `renamed-file` topic branch. The `combined` and `dense-combined` modes produce identical output to `first-parent` in this instance, which is expected. Now, let's use `remerge`, which should be illuminating with regards to "evil merges" [*1*], as it shows what was done on top of the automatic merge: 1: 1e6226b < -: ------- s/12/B/ 2: 043e738 < -: ------- merge -: ------- > 1: 6ddec15 merge Huh. That is a bit surprising at first. Let's use a higher creation factor to ask `range-diff` to match correspondences more loosely (I used 999, which kinda forces even non-sensical pairings): 1: 1e6226b ! 1: 6ddec15 s/12/B/ @@ ## Metadata ## -Author: Thomas Rast <trast@xxxxxxxxxxx> +Author: A U Thor <author@xxxxxxxxxxx> ## Commit message ## - s/12/B/ + merge ## renamed-file ## + remerge CONFLICT (content): Merge conflict in renamed-file + index 3f1c0da..25ddf7a 100644 + --- renamed-file + +++ renamed-file @@ renamed-file: A 9 10 B +-<<<<<<< 2901f77 (s/12/B/):file + B +-======= -12 -+B +->>>>>>> 3ce7af6 (s/11/B/):renamed-file 13 14 15 2: 043e738 < -: ------- merge (Note that if you repeat this experiment, you will see something different due to a bug in `--remerge-diff` that I will fix separately.) Hold on, that looks wrong! It is actually not wrong, `range-diff` just finds the changeB a better pairing than the merge commit with an empty diff... So: This mode can be useful to detect where merge conflicts had to be resolved. In short, I think that in `range-diff`'s context all of these modes have merit, depending on a particular use case. It's just that `first-parent` is probably the most natural to use in the common case. > At a minimum, we should mention that it is the one that makes most sense > (if that is the case). Yeah, I'll go with that and drop saying anything about other modes. Ciao, Johannes Footnote *1*: We should really find a much better name for this than "evil merge". There is nothing evil about me having to add `#include "environment.h"` in v2.45.1^, for example. It was necessary so that the code would build. Tedious, yes, but not evil.