Re: [PATCH v2 0/8] Add a new --remerge-diff capability to show & log

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Dec 27 2021, Elijah Newren wrote:

> On Sun, Dec 26, 2021 at 2:28 PM Ævar Arnfjörð Bjarmason
> <avarab@xxxxxxxxx> wrote:
>>
>> On Sat, Dec 25 2021, Elijah Newren via GitGitGadget wrote:
>>
>> > === FURTHER BACKGROUND (original cover letter material) ==
>> >
>> > Here are some example commits you can try this out on (with git show
>> > --remerge-diff $COMMIT):
>> >
>> >  * git.git conflicted merge: 07601b5b36
>> >  * git.git non-conflicted change: bf04590ecd
>> >  * linux.git conflicted merge: eab3540562fb
>> >  * linux.git non-conflicted change: 223cea6a4f05
>> >
>> > Many more can be found by just running git log --merges --remerge-diff in
>> > your repository of choice and searching for diffs (most merges tend to be
>> > clean and unmodified and thus produce no diff but a search of '^diff' in the
>> > log output tends to find the examples nicely).
>> >
>> > Some basic high level details about this new option:
>> >
>> >  * This option is most naturally compared to --cc, though the output seems
>> >    to be much more understandable to most users than --cc output.
>> >  * Since merges are often clean and unmodified, this new option results in
>> >    an empty diff for most merges.
>> >  * This new option shows things like the removal of conflict markers, which
>> >    hunks users picked from the various conflicted sides to keep or remove,
>> >    and shows changes made outside of conflict markers (which might reflect
>> >    changes needed to resolve semantic conflicts or cleanups of e.g.
>> >    compilation warnings or other additional changes an integrator felt
>> >    belonged in the merged result).
>> >  * This new option does not (currently) work for octopus merges, since
>> >    merge-ort is specific to two-parent merges[1].
>> >  * This option will not work on a read-only or full filesystem[2].
>> >  * We discussed this capability at Git Merge 2020, and one of the
>> >    suggestions was doing a periodic git gc --auto during the operation (due
>> >    to potential new blobs and trees created during the operation). I found a
>> >    way to avoid that; see [2].
>> >  * This option is faster than you'd probably expect; it handles 33.5 merge
>> >    commits per second in linux.git on my computer; see below.
>> >
>> > In regards to the performance point above, the timing for running the
>> > following command:
>> >
>> > time git log --min-parents=2 --max-parents=2 $DIFF_FLAG | wc -l
>>
>> I've been trying to come up with some other useful recipies for this new
>> option (which is already very useful, thanks!)
>
> I'm glad you like it.  :-)
>
>> Some of these (if correct) are suggestions for incorporating into the
>> (now rather sparse) documentation. I.e. walking users through how to use
>> this, and how (if at all) it combines with other options.
>>
>> I wanted to find all merges between "master".."seen" for which Junio's
>> had to resolve a conflict, a naïve version is:
>>
>>     $ git log --oneline --remerge-diff -p --min-parents=2 origin/master..origin/seen|grep ^diff -B1 | grep Merge
>>     [...]
>
> I think the naive version is
>   $ git log --remerge-diff --min-parents=2 origin/master..origin/seen
>   <search for "^diff" using your pager's search functionality>
>
> Where the "--min-parents=2 origin/master..origin/seen" comes from your
> problem description ("find all merges between master..seen").
>
> You can add --oneline to format it, though it's an orthogonal concern.
> Also, adding -p is unnecessary: --remerge-diff, like --cc, implies -p.
>
>> But I found that this new option nicely integrates with --diff-filter,
>> i.e. we'll end up showing a diff, and the diff machinery allows you to
>> to filter on it.
>>
>> It seems to me like all the diffs you show fall under "M", so for
>
> Yes, the diffs I happened to pick all fell under "M", but by no means
> should you rely on that happening for all merges in history.  For
> example, make a new merge commit, then add a completely new file (or
> delete a file, or rename a file, or copy a file, or change its
> mode/type), stage the new/deleted/renamed/copied/changed file, and run
> "git commit --amend".
>
> So, although --diff-filter=M can be interesting, I would not rely on it.

*Nod* I hadn't thought of those (in retrospect, rather obvious) cases.

>> master..seen (2ae0a9cb829..61055c2920d) this is equivalent (and the
>> output is the same as the above):
>>
>>     $ git -P log --oneline --remerge-diff --no-patch --min-parents=2 --diff-filter=M origin/master..origin/seen
>>     95daa54b1c3 Merge branch 'hn/reftable-fixes' into seen
>>     26c4c09dd34 Merge branch 'gc/fetch-negotiate-only-early-return' into seen
>>     e3dc8d073f6 Merge branch 'gc/branch-recurse-submodules' into seen
>>     aeada898196 Merge branch 'js/branch-track-inherit' into seen
>>     4dd30e0da45 Merge branch 'jh/builtin-fsmonitor-part2' into seen
>>     337743b17d0 Merge branch 'ab/config-based-hooks-2' into seen
>>     261672178c0 Merge branch 'pw/fix-some-issues-in-reset-head' into seen
>>     1296d35b041 Merge branch 'ms/customizable-ident-expansion' into seen
>>     7a3d7d05126 Merge branch 'ja/i18n-similar-messages' into seen
>>     eda714bb8bc Merge branch 'tb/midx-bitmap-corruption-fix' into seen
>>     ba02295e3f8 Merge branch 'jh/p4-human-unit-numbers' into jch
>>     751773fc38b Merge branch 'es/test-chain-lint' into jch
>>     ec17879f495 Merge branch 'tb/cruft-packs' into tb/midx-bitmap-corruption-fix
>>
>> However for "origin/master..origin/next" (next = 510f9eba9a2 currently)
>> we'll oddly show this with "-p":
>>
>>     9af51fd1d0d Sync with 'master'
>>     diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
>>     CONFLICT (content): Merge conflict in t/lib-gpg.sh
>>     d6f56f3248e Merge branch 'es/test-chain-lint' into next
>>     diff --git a/t/t4126-apply-empty.sh b/t/t4126-apply-empty.sh
>>     CONFLICT (content): Merge conflict in t/t4126-apply-empty.sh
>>     index 996c93329c6..33860d38290 100755
>>     --- a/t/t4126-apply-empty.sh
>>     +++ b/t/t4126-apply-empty.sh
>>     [...]
>>
>> The "oddly" applying only to that "9af51fd1d0d Sync with 'master'", not
>> the second d6f56f3248e, which shows the sort of conflict I'd expect. The
>> two-line "diff" of:
>>
>>     diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
>>     CONFLICT (content): Merge conflict in t/lib-gpg.sh
>>
>> Shows up with -p --remerge-diff, not a mere -p. I also tried the other
>> --diff-merges=* options, that behavior is new in
>> --diff-merges=remerge. Is this a bug?
>
> Ugh, this is related to my comment elsewhere that conflicts from inner
> merges are not nicely differentiated.  If I also apply my other series
> (which has not yet been submitted), this instead appears as follows:
>
> $ git show --oneline --remerge-diff 9af51fd1d0d
> 9af51fd1d0 Sync with 'master'
> diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
>   From inner merge:  CONFLICT (content): Merge conflict in t/lib-gpg.sh
>
> and the addition of the "From inner merge: " text makes it clearer why
> that line appears.  This is an interesting case where a conflict
> notice _only_ appears in the inner merge (i.e. the merge of merge
> bases), which means that both sides on the outer merge changed the
> relevant portion of the file in the same way, so the outer merge had
> no conflict.
>
> However, instead of trying to differentiate messages from inner
> merges, I think for --remerge-diff's purposes we should just drop all
> notices that come from the inner merges.  Those conflict notices might
> be helpful when initially resolving a merge, but at the --remerge-diff
> level, they're more likely to be distracting than helpful.

Thanks. I see that v3 addresses this. Will try it out...

>> My local build also has a --pickaxe-patch option. It's something I
>> submitted on-list before[1] and have been meaning to re-roll.
>>
>> I'm discussing it here because it skips the stripping of the "+ " and "-
>> " prefixes under -G<regex> and allows you to search through the -U<n>
>> context. With that I'm able to do:
>>
>>     git log --oneline --remerge-diff -p --min-parents=2 --pickaxe-patch -G'^\+' --diff-filter=M origin/master..origin/seen
>>
>> I.e. on top of the above filter only show those diffs that have
>> additions. FAICT the conflicting diffs where the committer of the merge
>> conflict picked one side or the other will only have "-" lines".
>>
>> So those diffs that have additions look to be those where the person
>> doing the merge needed to combine the two.
>>
>> Well, usually. E.g. 26c4c09dd34 (Merge branch
>> 'gc/fetch-negotiate-only-early-return' into seen, 2021-12-25) in that
>> range shows that isn't strictly true. Most such deletion-only diffs are
>> less interesting in picking one side or the other of the conflict, but
>> that one combines the two:
>>
>>     -<<<<<<< d3419aac9f4 (Merge branch 'pw/add-p-hunk-split-fix' into seen)
>>                             warning(_("protocol does not support --negotiate-only, exiting"));
>>     -                       return 1;
>>     -=======
>>     -                       warning(_("Protocol does not support --negotiate-only, exiting."));
>>                             result = 1;
>>                             goto cleanup;
>>     ->>>>>>> 495e8601f28 (builtin/fetch: die on --negotiate-only and --recurse-submodules)
>>
>> Which I guess is partially commentary and partially a request (either
>> for this series, or some follow-up) for something like a
>> --remerge-diff-filter option. I.e. it would be very useful to be able to
>> filter on some combination of:
>>
>>  * Which side(s) of the conflict(s) were picked, or a combination?
>>  * Is there "new work" in the diff to resolve the conflict?
>>    AFIACT this will always mean we'll have "+ " lines.
>
> Do any of the following count as "new work"? :
>
>   * the deletion of a file (perhaps one that had no conflict but was
> deleted anyway)
>   * mode changes (again, perhaps on files that had no conflict)
>   * renames of files/directories?
>
> If so, searching for "^+" lines might be insufficient, but it depends
> on what you mean by new work.

Yes, I think at least for the use-cases I had in mind the useful thing
is knowing which state a merge commit is in:

 A: "Vanilla" merge (i.e. no --remerge-diff output)
 B: non-"vanilla" merge" (i.e. ew have --remerge-diff output)

What I'm requesting/musing about here (which is very much in the
category of stuff we can/should do later, and shouldn't prevent this
"good enough for now" series going in) is being able to disambiguate
those "B" cases.

I.e. being able to see if their/ours side "won", some mixture of the two
etc.

>> Or maybe that's not useful at all, and just -G<rx> (maybe combined with
>> my --pickaxe-patch) will cover it?
>
> I'd rather wait until we have a good idea of the potential range of
> usecases before adding a filter.  (And I think for now, the -G and
> --pickaxe-patch are probably good enough for this usecase.)  These
> particular usecases you point out are interesting; thanks for
> detailing them.  Here's some others to consider:
>
>   * Finding out when text was added or removed: `git log
> --remerge-diff -S<text>` (note that with only -p instead of
> --remerge-diff, that command will annoyingly misses cases where a
> merge introduced or removed the text)
>   * Finding out how a merge differed from one run with some
> non-default options (e.g. `git show --remerge-diff -Xours` or `git
> show --remerge-diff -Xno-space-change`; although show doesn't take -X
> options so this is just an idea at this point)
>   * Finding out how a merge would have differed had it been run with
> different options (so instead of comparing a remerge to the merge
> recorded in history, compare one remerge with default options with a
> different merge that uses e.g. -Xno-space-change)
>
> Also, I've got a follow-up series that also introduces a
> --remerge-diff-only flag which:
>   * For single parent commits that cannot be identified as a revert or
> cherry-pick, do not show a diff.
>   * For single parent commits that can be identified as a revert or
> cherry-pick, instead of showing a diff against the parent of the
> commit, redo the revert or cherry-pick in memory and show a diff
> against that.
>   * For merge commits, act the same as --remerge-diff

*nod*




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux