Re: [PATCH 07/48] t6039: Ensure rename/rename conflicts leave index and workdir in sane state

Elijah Newren <newren@xxxxxxxxx> · Mon, 8 Aug 2011 11:59:19 -0600

On Mon, Jul 18, 2011 at 5:40 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>> +# Test for all kinds of things that can go wrong with rename/rename (2to1):
>> +#   Commit A: new files: a & b
>> +#   Commit B: rename a->c, modify b
>> +#   Commit C: rename b->c, modify a
>> +#
>> +# Merging of B & C should NOT be clean.  Questions:
>> +#   * Both a & b should be removed by the merge; are they?
>> +#   * The two c's should contain modifications to a & b; do they?
>> +#   * The index should contain two files, both for c; does it?
>> +#   * The working copy should have two files, both of form c~<unique>; does it?
>> +#   * Nothing else should be present.  Is anything?
>
> What is the most useful thing to leave in the index and in the working
> tree for the person who needs to resolve such a merge using the working
> tree, starting from B and merging C? The above "Questions" lists what the
> current code might try to do but I am not sure if it is really useful. For
> example, in the index, you would have to stuff two stage #1 entries ("a"
> from A and "b" from A) for path "c", with stage #2 ("c" from B) and stage
> #3 ("c" from C) entries, and represent what B tried to do to "a" (in the
> above you said "rename a->c" but it does not have to be a rename without
> content change) and what C tried to do to "b" in the half-conflicted
> result that is in a single file "c". Because the results are totally
> unrelated files (one side wants a variant of original "a" there, the other
> side wants a variant of "b"), such a half-merge result is totally useless
> to help the person to come up with anything.
>
> Also renaming "c" to "c~<unique>", if they do not have corresponding
> entries in the index to let you check with "git diff", would make the
> result _harder_ to use, not easier. So if you are going to rename "c" to
> "c-B" and "c-C", at least it would make much more sense to have in the
> index:
>
>  - "c-B", with stage #1 ("a" from A), stage #2 ("c" from B) and stage #3
>   ("a" from C);
>  - "c-C", with stage #1 ("b" from A), stage #2 ("b" from B) and stage #3
>   ("c" from C); and
>  - No "a" nor "b" in the index nor in the working tree.
>
> no?
>
> That way, you could run "git diff" to get what happened to the two
> variants of "a" and "b" at the content level, and decide to clean things
> up with:
>
>    $ git diff ;# view content level merge
>    $ edit c-B c-C; git add c-B c-C
>    $ git mv c-B c-some-saner-name
>    $ git mv c-C c-another-saner-name
>    $ edit other files that refer to c like Makefile
>    $ git commit

That sounds very interesting.  My first thought is that you'd have to
do the same thing in the case of a D/F conflict, but I notice that
later in the patch series you asked for exactly that.  The idea
certainly has potential, though I might need to think it through a
little more.
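
For anyone who wants to poke at the resulting state directly, here is a
minimal sketch of the rename/rename(2to1) setup described in the test
comment quoted above.  The temporary repository, the identity, the file
contents, and the names base-A, side-B and side-C are all made up for
illustration; what ends up in the index and working tree depends on the
git version and merge backend in use, so the script only prints it.

```shell
#!/bin/sh
# Sketch only: repository location, identity, and contents are assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

# Commit A: new files a & b (ten lines each, so rename detection has
# enough content to match on).
printf 'a line %s\n' 1 2 3 4 5 6 7 8 9 10 >a
printf 'b line %s\n' 1 2 3 4 5 6 7 8 9 10 >b
git add a b
git commit -q -m A
git tag base-A

# Commit B: rename a->c, modify b.
git checkout -q -b side-B base-A
git mv a c
echo 'modified on B' >>b
git commit -q -a -m B

# Commit C: rename b->c, modify a.
git checkout -q -b side-C base-A
git mv b c
echo 'modified on C' >>a
git commit -q -a -m C

# Starting from B, merge C and record whether the merge was clean.
git checkout -q side-B
merge_2to1_status=0
git merge --no-edit side-C >/dev/null 2>&1 || merge_2to1_status=$?
echo "merge exit status: $merge_2to1_status"

# Inspect what was left behind: index stages, then working tree status.
git ls-files -s
git status --short
```

Comparing the "git ls-files -s" output with the files actually on disk
shows whether each conflicted path in the working tree has matching
stage entries in the index.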

> To take it one step further to the extreme, it might give us a more
> reasonable and useful conflicted state if we deliberately dropped some
> information instead in a case like this, e.g.:
>
>  - We may want to have "a" at stage #1 (from A) in the index;
>  - No "a" remains in the working tree;
>  - "b" at stage #1 (from A), stage #2 (from B) and stage #3 ("c" from C);
>  - "b" in the working tree a conflicted-merge of the above three;
>  - "c" at stage #1 ("a" from A), stage #2 (from B), and stage #3 ("a" from
>   C); and
>  - "c" in the working tree a conflicted-merge of the above three.
>
> Note that unlike the current merge-recursive that tries to come up with a
> temporary pathname to store both versions of C, this would ignore "mv b c"
> on the A->C branch, and make the conflicted tentative merge asymmetric
> (merging B into C and merging C into B would give different conflicts),
> but I suspect that the asymmetry may not hurt us.
>
> Whether the merger wants to keep "c" that was derived from "a" (in line
> with the HEAD) or "c" that was derived from "b" (in line with MERGE_HEAD),
> if the result were to keep both files in some shape, the content level
> edit, renaming of at least one side, and adjusting other files that refer
> to it, are all required anyway, e.g.
>
>    $ git diff ;# view content level merge
>    $ edit b c; git add b c
>    $ edit other files that refer to c like Makefile (the content C's
>      change wants is now in "b").
>    $ git commit
>
> would be a way to pick "c" as "c-some-saner-name" and "b" as
> "c-another-saner-name" in the previous workflow, but needs much less
> typing. The complexity of the workflow would be the same if the final
> resolution is to take what one side did and drop the other's work,
> I think.

I think the asymmetry is slightly confusing and could become
problematic.  If we decide to turn on break detection, then we would
hit problems in a scenario such as:

Commit A: files a, b are present
Commit B: rename a->c, add an unrelated a
Commit C: rename b->c, add an unrelated b

In that case, "undoing" the rename as you suggest would put the
content back at a path where unrelated content has since been added,
which gives us yet another conflict.
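
A rough sketch of that history, in case it helps.  The repository,
identity, file contents, and the names base-A, side-B and side-C are
assumptions for illustration only.  Note that the merge here runs
without break detection, so it only shows the starting point: I would
expect the renames to go unnoticed and "c" to be treated as added on
both sides.

```shell
#!/bin/sh
# Sketch only: repository location, identity, and contents are assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

# Commit A: files a, b are present.
printf 'a line %s\n' 1 2 3 4 5 6 7 8 9 10 >a
printf 'b line %s\n' 1 2 3 4 5 6 7 8 9 10 >b
git add a b
git commit -q -m A
git tag base-A

# Commit B: rename a->c, add an unrelated a.
git checkout -q -b side-B base-A
git mv a c
printf 'unrelated a content %s\n' 1 2 3 4 5 >a
git add a
git commit -q -m B

# Commit C: rename b->c, add an unrelated b.
git checkout -q -b side-C base-A
git mv b c
printf 'unrelated b content %s\n' 1 2 3 4 5 >b
git add b
git commit -q -m C

# Starting from B, merge C and record whether the merge was clean.
git checkout -q side-B
merge_break_status=0
git merge --no-edit side-C >/dev/null 2>&1 || merge_break_status=$?
echo "merge exit status: $merge_break_status"

# Paths left unmerged in the index (second, tab-separated field of -u).
unmerged_paths=$(git ls-files -u | cut -f2 | sort -u)
echo "unmerged paths: $unmerged_paths"
```

With break detection the old "a" and "b" would become rename sources
again, and that is where moving content back to those paths would
collide with the unrelated files now living there.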

Also, as mentioned above, D/F conflicts hit similar cases where we
need to rename the path in the working copy.  If we try to handle them
similarly to how you are suggesting for the rename/rename(2to1) case,
we can do so in some cases but hit problems in others.  For example,
take a rename/delete conflict with D/F conflicts:

Commit A: file a is present
Commit B: rename a -> df, possibly also modifying it
Commit C: delete a, add files a/foo and df/bar

We can't use either the path 'df' or 'a' for recording the content,
since C has turned both into directories.  I think the rules become
too confusing for "selectively undoing renames" and it'd be easier to
just use <bad-dest-path>~<unique> in all cases.
However, I think your suggestion to move index stage information to
these uniquely renamed paths could probably work and may be useful.
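
The rename/delete plus D/F case above can be set up roughly as follows.
This is only a sketch: the repository, identity, file contents, and the
names base-A, side-B and side-C are assumptions, and B does a pure
rename here with no content change.  The exact names git picks for any
relocated file vary between versions, so the script just lists what is
there after the merge.

```shell
#!/bin/sh
# Sketch only: repository location, identity, and contents are assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

# Commit A: file a is present.
printf 'line %s\n' 1 2 3 4 5 6 7 8 9 10 >a
git add a
git commit -q -m A
git tag base-A

# Commit B: rename a -> df.
git checkout -q -b side-B base-A
git mv a df
git commit -q -m B

# Commit C: delete a, add files a/foo and df/bar.
git checkout -q -b side-C base-A
git rm -q a
mkdir a df
echo foo >a/foo
echo bar >df/bar
git add a/foo df/bar
git commit -q -m C

# Starting from B, merge C and record whether the merge was clean.
git checkout -q side-B
merge_df_status=0
git merge --no-edit side-C >/dev/null 2>&1 || merge_df_status=$?
echo "merge exit status: $merge_df_status"

# Show index stages and every file on disk outside .git, to see where
# the renamed content ended up.
git ls-files -s
find . -path ./.git -prune -o -type f -print
```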