Thomas Gummerer <t.gummerer@xxxxxxxxx> writes: > +Conflict normalization > +---------------------- > + > +To try and re-do a conflict resolution, even when different merge > +strategies are used, 'rerere' computes a conflict ID for each > +conflict in the file. > + > +This is done by discarding the common ancestor version in the > +diff3-style, and re-ordering the two sides of the conflict, in > +alphabetic order. s/discarding.*-style/normalising the conflicted section to 'merge' style/ The motivation behind the normalization should probably be given upfront in the first paragraph. It is to ensure the recorded resolutions can be looked up from the rerere database for application, even when branches are merged in different order. I am not sure what you meant by even when different merge stratagies are used; I'd drop that if I were writing the paragraph. > +Using this technique a conflict that looks as follows when for example > +'master' was merged into a topic branch: > + > + <<<<<<< HEAD > + foo > + ======= > + bar > + >>>>>>> master > + > +and the opposite way when the topic branch is merged into 'master': > + > + <<<<<<< HEAD > + bar > + ======= > + foo > + >>>>>>> topic > + > +can be recognized as the same conflict, and can automatically be > +re-resolved by 'rerere', as the SHA-1 sum of the two conflicts would > +be calculated from 'bar<NUL>foo<NUL>' in both cases. You earlier talked about normalizing and reordering, but did not talk about "concatenate both with NUL in between and hash", so the explanation in the last two lines are not quite understandable by mere mortals, even though I know which part of the code you are referring to. When you talk about hasing, you may want to make sure the readers understand that the branch label on <<< and >>> lines are ignored. > +If there are multiple conflicts in one file, they are all appended to > +one another, both in the 'preimage' file as well as in the conflict > +ID. In case it was not clear (and I do not think it is to those who only read your description and haven't thought things through themselves), this concatenation is why the normalization by reordering is helpful. Imagine that a common ancestor had a file with a line with string "A" on it (I'll call such a line "line A" for brevity in the following) in its early part, and line X in its late part. And then you fork four branches that do these things: - AB: changes A to B - AC: changes A to C - XY: changes X to Y - XZ: changes X to Z Now, forking a branch ABAC off of branch AB and then merging AC into it, and forking a branch ACAB off of branch AC and then merging AB into it, would yield the conflict in a different order. The former would say "A became B or C, what now?" while the latter would say "A became C or B, what now?" But the act of merging AC into ABAC and resolving the conflict to leave line D means that you declare: After examining what branches AB and AC did, I believe that making line A into line D is the best thing to do that is compatible with what AB and AC wanted to do. So the conflict we would see when merging AB into ACAB should be resolved the same way---it is the resolution that is in line with that declaration. Imagine that similarly you had previously forked branch XYXZ from XY, merged XZ into it, and resolved "X became Y or Z" into "X became W". Now, if you forked a branch ABXY from AB and then merged XY, then ABXY would have line B in its early part and line Y in its later part. Such a merge would be quite clean. We can construct 4 combinations using these four branches ((AB, AC) x (XY, XZ)). Merging ABXY and ACXZ would make "an early A became B or C, a late X became Y or Z" conflict, while merging ACXY and ABXZ would make "an early A became C or B, a late X became Y or Z". We can see there are 4 combinations of ("B or C", "C or B") x ("X or Y", "Y or X"). By sorting, we can give the conflict its canonical name, namely, "an early part became B or C, a late part becames X or Y", and whenever any of these four patterns appear, we can get to the same conflict and resolution that we saw earlier. Without the sorting, we will have to somehow find a previous resolution from combinatorial explosion ;-) These days post ec34a8b1 ("Merge branch 'jc/rerere-multi'", 2016-05-23), the conflict ID can safely collide, i.e. hash collisions that drops completely different conflicts and their resolutions into the same .git/rr-cache/$id directory will not interfere with proper operation of the system, thanks to that rerere-multi topic that allows us to store multiple preimage conflicts that happens to share the same conflict ID with their corresponding postimage resolutions. In theory, we *should* be able to stub out the SHA-1 computation and give every conflict the same ID and rerere should still operate correctly, even though I haven't tried it yet myself.