On Mon, Dec 27 2021, Elijah Newren wrote: > On Sun, Dec 26, 2021 at 2:28 PM Ævar Arnfjörð Bjarmason > <avarab@xxxxxxxxx> wrote: >> >> On Sat, Dec 25 2021, Elijah Newren via GitGitGadget wrote: >> >> > === FURTHER BACKGROUND (original cover letter material) == >> > >> > Here are some example commits you can try this out on (with git show >> > --remerge-diff $COMMIT): >> > >> > * git.git conflicted merge: 07601b5b36 >> > * git.git non-conflicted change: bf04590ecd >> > * linux.git conflicted merge: eab3540562fb >> > * linux.git non-conflicted change: 223cea6a4f05 >> > >> > Many more can be found by just running git log --merges --remerge-diff in >> > your repository of choice and searching for diffs (most merges tend to be >> > clean and unmodified and thus produce no diff but a search of '^diff' in the >> > log output tends to find the examples nicely). >> > >> > Some basic high level details about this new option: >> > >> > * This option is most naturally compared to --cc, though the output seems >> > to be much more understandable to most users than --cc output. >> > * Since merges are often clean and unmodified, this new option results in >> > an empty diff for most merges. >> > * This new option shows things like the removal of conflict markers, which >> > hunks users picked from the various conflicted sides to keep or remove, >> > and shows changes made outside of conflict markers (which might reflect >> > changes needed to resolve semantic conflicts or cleanups of e.g. >> > compilation warnings or other additional changes an integrator felt >> > belonged in the merged result). >> > * This new option does not (currently) work for octopus merges, since >> > merge-ort is specific to two-parent merges[1]. >> > * This option will not work on a read-only or full filesystem[2]. >> > * We discussed this capability at Git Merge 2020, and one of the >> > suggestions was doing a periodic git gc --auto during the operation (due >> > to potential new blobs and trees created during the operation). I found a >> > way to avoid that; see [2]. >> > * This option is faster than you'd probably expect; it handles 33.5 merge >> > commits per second in linux.git on my computer; see below. >> > >> > In regards to the performance point above, the timing for running the >> > following command: >> > >> > time git log --min-parents=2 --max-parents=2 $DIFF_FLAG | wc -l >> >> I've been trying to come up with some other useful recipies for this new >> option (which is already very useful, thanks!) > > I'm glad you like it. :-) > >> Some of these (if correct) are suggestions for incorporating into the >> (now rather sparse) documentation. I.e. walking users through how to use >> this, and how (if at all) it combines with other options. >> >> I wanted to find all merges between "master".."seen" for which Junio's >> had to resolve a conflict, a naïve version is: >> >> $ git log --oneline --remerge-diff -p --min-parents=2 origin/master..origin/seen|grep ^diff -B1 | grep Merge >> [...] > > I think the naive version is > $ git log --remerge-diff --min-parents=2 origin/master..origin/seen > <search for "^diff" using your pager's search functionality> > > Where the "--min-parents=2 origin/master..origin/seen" comes from your > problem description ("find all merges between master..seen"). > > You can add --oneline to format it, though it's an orthogonal concern. > Also, adding -p is unnecessary: --remerge-diff, like --cc, implies -p. > >> But I found that this new option nicely integrates with --diff-filter, >> i.e. we'll end up showing a diff, and the diff machinery allows you to >> to filter on it. >> >> It seems to me like all the diffs you show fall under "M", so for > > Yes, the diffs I happened to pick all fell under "M", but by no means > should you rely on that happening for all merges in history. For > example, make a new merge commit, then add a completely new file (or > delete a file, or rename a file, or copy a file, or change its > mode/type), stage the new/deleted/renamed/copied/changed file, and run > "git commit --amend". > > So, although --diff-filter=M can be interesting, I would not rely on it. *Nod* I hadn't thought of those (in retrospect, rather obvious) cases. >> master..seen (2ae0a9cb829..61055c2920d) this is equivalent (and the >> output is the same as the above): >> >> $ git -P log --oneline --remerge-diff --no-patch --min-parents=2 --diff-filter=M origin/master..origin/seen >> 95daa54b1c3 Merge branch 'hn/reftable-fixes' into seen >> 26c4c09dd34 Merge branch 'gc/fetch-negotiate-only-early-return' into seen >> e3dc8d073f6 Merge branch 'gc/branch-recurse-submodules' into seen >> aeada898196 Merge branch 'js/branch-track-inherit' into seen >> 4dd30e0da45 Merge branch 'jh/builtin-fsmonitor-part2' into seen >> 337743b17d0 Merge branch 'ab/config-based-hooks-2' into seen >> 261672178c0 Merge branch 'pw/fix-some-issues-in-reset-head' into seen >> 1296d35b041 Merge branch 'ms/customizable-ident-expansion' into seen >> 7a3d7d05126 Merge branch 'ja/i18n-similar-messages' into seen >> eda714bb8bc Merge branch 'tb/midx-bitmap-corruption-fix' into seen >> ba02295e3f8 Merge branch 'jh/p4-human-unit-numbers' into jch >> 751773fc38b Merge branch 'es/test-chain-lint' into jch >> ec17879f495 Merge branch 'tb/cruft-packs' into tb/midx-bitmap-corruption-fix >> >> However for "origin/master..origin/next" (next = 510f9eba9a2 currently) >> we'll oddly show this with "-p": >> >> 9af51fd1d0d Sync with 'master' >> diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh >> CONFLICT (content): Merge conflict in t/lib-gpg.sh >> d6f56f3248e Merge branch 'es/test-chain-lint' into next >> diff --git a/t/t4126-apply-empty.sh b/t/t4126-apply-empty.sh >> CONFLICT (content): Merge conflict in t/t4126-apply-empty.sh >> index 996c93329c6..33860d38290 100755 >> --- a/t/t4126-apply-empty.sh >> +++ b/t/t4126-apply-empty.sh >> [...] >> >> The "oddly" applying only to that "9af51fd1d0d Sync with 'master'", not >> the second d6f56f3248e, which shows the sort of conflict I'd expect. The >> two-line "diff" of: >> >> diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh >> CONFLICT (content): Merge conflict in t/lib-gpg.sh >> >> Shows up with -p --remerge-diff, not a mere -p. I also tried the other >> --diff-merges=* options, that behavior is new in >> --diff-merges=remerge. Is this a bug? > > Ugh, this is related to my comment elsewhere that conflicts from inner > merges are not nicely differentiated. If I also apply my other series > (which has not yet been submitted), this instead appears as follows: > > $ git show --oneline --remerge-diff 9af51fd1d0d > 9af51fd1d0 Sync with 'master' > diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh > From inner merge: CONFLICT (content): Merge conflict in t/lib-gpg.sh > > and the addition of the "From inner merge: " text makes it clearer why > that line appears. This is an interesting case where a conflict > notice _only_ appears in the inner merge (i.e. the merge of merge > bases), which means that both sides on the outer merge changed the > relevant portion of the file in the same way, so the outer merge had > no conflict. > > However, instead of trying to differentiate messages from inner > merges, I think for --remerge-diff's purposes we should just drop all > notices that come from the inner merges. Those conflict notices might > be helpful when initially resolving a merge, but at the --remerge-diff > level, they're more likely to be distracting than helpful. Thanks. I see that v3 addresses this. Will try it out... >> My local build also has a --pickaxe-patch option. It's something I >> submitted on-list before[1] and have been meaning to re-roll. >> >> I'm discussing it here because it skips the stripping of the "+ " and "- >> " prefixes under -G<regex> and allows you to search through the -U<n> >> context. With that I'm able to do: >> >> git log --oneline --remerge-diff -p --min-parents=2 --pickaxe-patch -G'^\+' --diff-filter=M origin/master..origin/seen >> >> I.e. on top of the above filter only show those diffs that have >> additions. FAICT the conflicting diffs where the committer of the merge >> conflict picked one side or the other will only have "-" lines". >> >> So those diffs that have additions look to be those where the person >> doing the merge needed to combine the two. >> >> Well, usually. E.g. 26c4c09dd34 (Merge branch >> 'gc/fetch-negotiate-only-early-return' into seen, 2021-12-25) in that >> range shows that isn't strictly true. Most such deletion-only diffs are >> less interesting in picking one side or the other of the conflict, but >> that one combines the two: >> >> -<<<<<<< d3419aac9f4 (Merge branch 'pw/add-p-hunk-split-fix' into seen) >> warning(_("protocol does not support --negotiate-only, exiting")); >> - return 1; >> -======= >> - warning(_("Protocol does not support --negotiate-only, exiting.")); >> result = 1; >> goto cleanup; >> ->>>>>>> 495e8601f28 (builtin/fetch: die on --negotiate-only and --recurse-submodules) >> >> Which I guess is partially commentary and partially a request (either >> for this series, or some follow-up) for something like a >> --remerge-diff-filter option. I.e. it would be very useful to be able to >> filter on some combination of: >> >> * Which side(s) of the conflict(s) were picked, or a combination? >> * Is there "new work" in the diff to resolve the conflict? >> AFIACT this will always mean we'll have "+ " lines. > > Do any of the following count as "new work"? : > > * the deletion of a file (perhaps one that had no conflict but was > deleted anyway) > * mode changes (again, perhaps on files that had no conflict) > * renames of files/directories? > > If so, searching for "^+" lines might be insufficient, but it depends > on what you mean by new work. Yes, I think at least for the use-cases I had in mind the useful thing is knowing which state a merge commit is in: A: "Vanilla" merge (i.e. no --remerge-diff output) B: non-"vanilla" merge" (i.e. ew have --remerge-diff output) What I'm requesting/musing about here (which is very much in the category of stuff we can/should do later, and shouldn't prevent this "good enough for now" series going in) is being able to disambiguate those "B" cases. I.e. being able to see if their/ours side "won", some mixture of the two etc. >> Or maybe that's not useful at all, and just -G<rx> (maybe combined with >> my --pickaxe-patch) will cover it? > > I'd rather wait until we have a good idea of the potential range of > usecases before adding a filter. (And I think for now, the -G and > --pickaxe-patch are probably good enough for this usecase.) These > particular usecases you point out are interesting; thanks for > detailing them. Here's some others to consider: > > * Finding out when text was added or removed: `git log > --remerge-diff -S<text>` (note that with only -p instead of > --remerge-diff, that command will annoyingly misses cases where a > merge introduced or removed the text) > * Finding out how a merge differed from one run with some > non-default options (e.g. `git show --remerge-diff -Xours` or `git > show --remerge-diff -Xno-space-change`; although show doesn't take -X > options so this is just an idea at this point) > * Finding out how a merge would have differed had it been run with > different options (so instead of comparing a remerge to the merge > recorded in history, compare one remerge with default options with a > different merge that uses e.g. -Xno-space-change) > > Also, I've got a follow-up series that also introduces a > --remerge-diff-only flag which: > * For single parent commits that cannot be identified as a revert or > cherry-pick, do not show a diff. > * For single parent commits that can be identified as a revert or > cherry-pick, instead of showing a diff against the parent of the > commit, redo the revert or cherry-pick in memory and show a diff > against that. > * For merge commits, act the same as --remerge-diff *nod*