Re: Bug in merge-ort (rename detection can have collisions?)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jun 7, 2022 at 5:11 PM Glen Choo <chooglen@xxxxxxxxxx> wrote:
>
> (I'm not 100% what the bug _is_, only that there is one.)
>
> = Report
>
> At $DAYJOB, there was a report that "git merge" was failing on certain
> branches. Fortunately, the repo is publicly accessible, so I can share
> the full reproduction recipe:
>
>   git clone https://android.googlesource.com/platform/external/tensorflow tensorflow &&
>   cd tensorflow &&
>   git merge origin/upstream-master # HEAD is at origin/master
>
> This gives:
>
>   Performing inexact rename detection: 100% (4371280/4371280), done.
>   Performing inexact rename detection: 100% (12529218/12529218), done.
>   Assertion failed: (ci->filemask == 2 || ci->filemask == 4), function apply_directory_rename_modifications, file merge-ort.c, line 2410.
>
> This bug seems specific to merge-ort; "git merge -s recursive
> origin/upstream-master" seems to work as expected.
>
> In case the branches have changed since then, here are the commit ids:
>
>   $ git rev-parse origin/master
>   68e55281824e8a79fa67e1a3061f39bd4c4b2e57
>   $ git rev-parse origin/upstream-master
>   0be5bb09aeeff3a6825842326fadc8159a5553ab
>   $ git merge-base 68e55281824e8a79fa67e1a3061f39bd4c4b2e57 0be5bb09aeeff3a6825842326fadc8159a5553ab
>   8e819019081f39d83df42baba4acfced3abf3f90
>
> = Interesting info
>
> I don't understand the merge-ort code enough to understand what's going
> on, but I was able to find some (hopefully helpful) details. I added
> this log line just above the offending assert() call:
>
>         trace2_printf("0 %s, 1 %s, 2 %s, fm %d, dm %d", ci->pathnames[0],
>     ci->pathnames[1], ci->pathnames[2], ci->filemask, ci->dirmask);
>
> Here are the lines I thought were suspicious:
>
>   0 <path1>, 1 <path1>, 2 <path1>, fm 2, dm 0
>   [...]
>   0 <path2>, 1 <path1>, 2 <path2>, fm 6, dm 0 # this is the last line
>
> Notice that the last line detected a rename from <path2> to <path1>, but
> we already saw <path1> earlier.
>
> IIUC "(ci->filemask == 2 || ci->filemask == 4)" can be read as "the path
> either exists on only the left side or only the right side of the
> merge", so ci->filemask == 6 should mean "the path exists on both sides
> of the merge"?
>
> "-s recursive" seems to handle the rename just fine (it picks <path2>
> IIRC).
>
> I also dug into each commit to see which paths were present:
>
>   head="origin/master"
>   other="origin/upstream-master"
>   merge_base="$(git merge-base origin/master origin/upstream-master)"
>   path1="tensorflow/lite/g3doc/convert/metadata_writer_tutorial.ipynb"
>   path2="tensorflow/lite/g3doc/models/convert/metadata_writer_tutorial.ipynb"
>
>   git rev-parse "$head:$path1" # (exists)
>   git rev-parse "$head:$path2" # (doesn't exist)
>
>   git rev-parse "$other:$path1" # (doesn't exist)
>   git rev-parse "$other:$path2" # (exists)
>
>   git rev-parse "$merge_base:$path1" # (doesn't exist)
>   git rev-parse "$merge_base:$path2" # (doesn't exist)
>
> i.e. both files are new and are renames of each other. I haven't tried
> using this property to create a minimally-reproducing recipe though.

Thanks for the detailed report; very cool.  Interestingly, if you
reverse the direction of the merge (checkout origin/upstream-master
and merge origin/master) then you get a different error:

    error: cache entry has null sha1:
tensorflow/lite/g3doc/examples/convert/metadata_writer_tutorial.ipynb
    fatal: unable to write .git/index

This merge has a lot of stuff going on, including some big directory
renames, new files, loads of conflicts, etc.  I think the relevant
bits are the following:

merge base -> side1:
    Rename tensorflow/lite/g3doc/models/ -> tensorflow/lite/g3doc/examples
    Add tensorflow/lite/g3doc/convert/metadata_writer_tutorial.ipynb

merge base -> side2:
    Rename tensorflow/lite/g3doc/convert/ ->
tensorflow/lite/g3doc/models/convert/
    Add tensorflow/lite/g3doc/models/convert/metadata_writer_tutorial.ipynb

So the combination of the above means:
    side1:tensorflow/lite/g3doc/convert/metadata_writer_tutorial.ipynb
would be affected by the side2 directory rename to become
    tensorflow/lite/g3doc/models/convert/metadata_writer_tutorial.ipynb
which would match the name of the file on side2, BUT the side2 version of
that file, i.e.:
    side2:tensorflow/lite/g3doc/models/convert/metadata_writer_tutorial.ipynb
would be affected by the side1 directory rename to become
    tensorflow/lite/g3doc/examples/convert/metadata_writer_tutorial.ipynb
and what becomes of the side1 version of that file?  It gets messy...


I have a small reproduction recipe (using the style from t6423 to explain it):

#   Commit O: sub1/file,                 sub2/other
#   Commit A: sub3/file,                 sub2/{other, new_add_add_file_1}
#   Commit B: sub1/{file, newfile}, sub1/sub2/{other, new_add_add_file_2}
#
#   In words:
#     A: sub1/ -> sub3/, add sub2/new_add_add_file_1
#     B: sub2/ -> sub1/sub2, add sub1/newfile, add sub1/sub2/new_add_add_file_2

This small reproduction recipe triggers the same assertion-failure you
reported; further, when the merge direction is reversed this testcase
shows the same alternative error I showed above for your bigger
testcase.  However, interestingly, this simple testcase also triggers
those same null-sha1 and unable-to-write-.git/index errors in
merge-recursive, regardless of the direction of the merge.  I don't
know why my testcase triggers bugs in merge-recursive and your bigger
testcase doesn't, but I wasn't too motivated to find out either; I was
mostly focused on trying to understand the merge-ort side of things.

Now, there's code in both merge-recursive and merge-ort for avoiding
doubly transitive renames (i.e. one side renames directory A/ -> B/,
and the other side renames directory B/ -> C/, and even worse if the
original side also renames C/ -> D/, because this combination would
otherwise make a mess for new files added to A/ on the first side and
wondering which directory they end up in).  However, this is a
testcase where a _leading directory_ of B/ is renamed to C/, which
sidesteps the normal doubly transitive rename check, and then the code
heads down what looks like the wrong path until it is caught by the
assertion check you reported.  In addition to the
funny-doubly-transitive-rename (involving the leading directory), your
testcase also has both sides add a file, one side to A/ and the other
side to B/ (with the same basename but different file contents), which
adds to the "fun".

Anyway, long story short...I don't have a fix yet, but just thought
I'd mention I saw the email and spent some hours digging in.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux