On Sun, Nov 10, 2024 at 4:37 PM Junio C Hamano <gitster@xxxxxxxxx> wrote: > > "Johannes Schindelin via GitGitGadget" <gitgitgadget@xxxxxxxxx> > writes: > [...] > > +--diff-merges=<format>:: > > + Instead of ignoring merge commits, generate diffs for them using the > > + corresponding `--diff-merges=<format>` option of linkgit:git-log[1], > > + and include them in the comparison. > > ++ > > +Note: In the common case, the `first-parent` mode will be the most natural one > > +to use, as it is consistent with the idea that a merge is kind of a "meta > > +patch", comprising all the merged commits' patches into a single one. > > I'll let you and Elijah figure out the best phrasing around here, > including whether we recommend only first-parent or give other > options. I'd probably suggest: Note: When trying to see how users adjusted the contents of a merge commit, `remerge-diff` is a natural choice. When viewing merges as a "meta patch", comprising a squash of all the merged commits' patches into a single one (similar to `git log --first-parent -p`), `first-parent` is a natural choice. > For me personally, I find that use of `first-parent` here is > consistent with the mindset of those users who prefer to read "git > log --first-parent -p", and to a smaller degree, those who use > squash merges. To them, a merge is merely bringing in lots of > changes into the mainline as a single unit, and other than that > these lots of changes are broken out for finer grained inspection > if/when they wanted to, they prefer to treat the changes brought in > by merges just like a single-parent commit. And from that point of > view, (1) reordering the changes inside the side branch is > immaterial for the purpose of talking about the result, the net > effect the merged topic makes to the mainline, and (2) dropping or > adding new changes to the side branch is treated just like a newer > iteration of a single parent commit having fewer or more changes. > So it is very natural that first-parent helps to make the world view > very simplistic, and more readable range-diff is all about how you > can present the two iterations in a simple and flattened form. > > There _may_ need a tweak of the matching algorithm to allow the > "same" merge on both sides to match, even if they diverge a lot, > though. A range-diff that pairs a merge in the previous iteration > with a single patch in the updated iteration may be hard to read. Sounds like you are arguing that there is an angle/usecase from which `first-parent` makes sense, namely the viewpoint that "a merge is merely bringing in lots of changes into the mainline as a single unit" as you put it. What was surprising to me, though, is that it's a completely different usecase than the one that was brought up in the commit message for this feature, namely "so-called 'evil merges' are sometimes necessary and need to be reviewed too". If you want to review "evilness" or the differences between the changes that `git merge` would make and what got recorded in the commit, then `first-parent` is not that great an approximation of what you are looking for, BUT it happens to work beautifully in common cases where you are merely transplanting commits due to the fact that the huge amounts of extras being put into the diff happen to be equal on the two sides, so they cancel out and leave you with the "evilness" you are interested in. But if you stray from that common case (by e.g. inserting, removing, or modifying commits while rebasing) and are still interested in "evilness", your approximation quickly drops from accurately pulling out just the bits you are interested in towards being completely swamped by the noise. My concern with the commit message and patch as they stand, is that I feel the `first-parent` mode is being recommended as a solution for a certain usecase where it is really only an approximation that luckily is highly accurate in special but common scenarios while being very inaccurate in others. I think that will lead to users getting negatively surprised when they move outside those common cases.