Re: Lost file after git merge

On 30.07.22 at 04:16, Elijah Newren wrote:
> On Fri, Jul 29, 2022 at 1:34 PM René Scharfe <l.s.r@xxxxxx> wrote:
>>
>> On 28.07.22 at 19:11, Junio C Hamano wrote:
>>> Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes:
>>>
>>>> On Thu, Jul 28 2022, Laďa Tesařík wrote:
>>>>
>>>>> 1. I added a file called 'new_file' to a master branch.
>>>>> 2. Then I created branch feature/2 and deleted the file in master
>>>>> 3. Then I deleted the file in branch feature/2 as well.
>>>>> 4. I created 'new_file' on branch feature/2 again.
>>>
>>> It heavily depends on how this creation is done, i.e. what went into
>>> the created file.  Imagine that a file existed with content A at
>>> commit 0, both commits 1 and 2 removed it on their forked history,
>>> and then commit 3 added exactly the same content A to the same path:
>>>
>>>           1---3
>>>          /     \
>>>     ----0---2---4---->
>>>
>>> When you are about to merge 2 and 3 to create 4, what would a
>>> three-way merge see?
>>>
>>>     0 had content A at path P
>>>     2 said "no we do not want content A at path P"
>>>     3 said "we are happy with content A at path P"
>>>
>>> So the net result is that 0-->3 "one side did not touch A at P" and
>>> 0-->2 "one side removed A at P".
>>>
>>> Three-way merge between X and Y is all about taking what X did if Y
>>> didn't have any opinion on what X touched.  This is exactly that
>>> case.  The history 0--->3 didn't have any opinion on what should be
>>> in P or whether P should exist, and that is why there is no change
>>> between these two endpoints.
>>
>> The last sentence is not necessarily true.  You could also say that
>> 0--->3 cared so much about path P having content A that it brought it
>> back from the void.  Determining whether a de-facto revert
>> - intended to return to an uncaring state of "take whatever main has" or
>> - meant to choose *that* specific content which incidentally is on main
>> is not possible from the snapshots at the merge point alone, I think.
>>
>> Checking if 0...3 touched P and leaving that path unmerged out of
>> caution shouldn't be terribly expensive.
>
> I think it might be terribly expensive.
>
> Walking history can easily be the slow part of such an operation, e.g.
> can_fast_forward() taking roughly 100 times as long as doing the
> merge_incore_recursive() portion that creates the new merged toplevel
> tree[1].  (And can_fast_forward() is a form of history walk that
> doesn't involve traversing into any trees, so I suspect it's a cheaper
> history traversal than what is being suggested).
>
> Focusing on the tree traversal side, this suggested change would
> essentially disable the trivial directory resolution optimizations in
> merge-ort[2].  (Note that the trivial directory resolution sped up a
> rebase that didn't involve very many renames by a factor of 25).  The
> whole point of that optimization was to avoid walking into trees that
> were only changed on one side, where possible.  Your proposed change
> would be saying we always have to walk into trees that either side
> modified...and do so for every intermediate commit as well so that we
> can fully enumerate all (temporarily) changed files.

True: compared to just checking whether the endpoint 3 changed a path,
a history traversal can take arbitrarily long.  At least it's bounded by
the merge base and limited to a specific path.  Renames complicate the
picture, but only exact renames (same blob or tree ID) need to be
considered.  That feels doable in a reasonable amount of time, but it's
not as cheap as ignoring the history.
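
To make the cost difference concrete, here is a minimal sketch in
Python.  It is a toy model, not Git's data structures or API: the
`history` dictionary, the `path_touched` helper and the restriction to
single-parent commits are all assumptions made up for illustration.  It
walks from a tip back to the merge base and reports whether any commit
on the way changed a path, even when the endpoints agree.

```python
# Toy history: commit id -> (parent id, snapshot), where a snapshot maps
# path -> content.  Hypothetical model for illustration, not Git's own.
history = {
    "0": (None, {"P": "A"}),   # merge base: P has content A
    "1": ("0", {}),            # side branch removes P
    "3": ("1", {"P": "A"}),    # side branch re-adds P with content A
    "2": ("0", {}),            # main removes P
}


def path_touched(history, base, tip, path):
    """Return True if any commit in base..tip changed `path`.

    Assumes a linear (single-parent) chain from `tip` back to `base`.
    """
    commit = tip
    while commit != base:
        parent, snapshot = history[commit]
        parent_snapshot = history[parent][1]
        if snapshot.get(path) != parent_snapshot.get(path):
            return True
        commit = parent
    return False


# Comparing only the endpoints says "3 did not change P" ...
endpoints_equal = history["0"][1].get("P") == history["3"][1].get("P")
# ... while walking the history says "P was touched between 0 and 3".
touched = path_touched(history, "0", "3", "P")
print(endpoints_equal, touched)
```

The endpoint comparison is a single lookup, while the walk visits
every commit between base and tip, which is where the extra cost
comes from.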

Assuming that one side doesn't care about a path because it has the same
content as the merge base is tempting.  And reverts that break this
assumption are probably quite rare.  Still, it led to an unintended
outcome here.  Reminds me of a recent chess robot incident [3].  Speed
is nice and safety has a cost, but are we already making the best
possible tradeoff here?
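
The assumption can be written down as the classic per-path three-way
rule.  The following is a minimal sketch in Python, not how merge-ort is
implemented; `merge_path` is a hypothetical helper, and `None` stands
for "path absent".  Fed with the three snapshots from the example
(0 has A, 2 has nothing, 3 has A again), it takes the deletion without
reporting a conflict.

```python
def merge_path(base, ours, theirs):
    """Three-way rule for a single path; None means the path is absent."""
    if ours == theirs:
        return ours      # both sides agree
    if ours == base:
        return theirs    # we have "no opinion", take their change
    if theirs == base:
        return ours      # they have "no opinion", take our change
    raise ValueError("conflict: both sides changed the path differently")


# Path P at commits 0 (base), 2 (main) and 3 (feature branch).
base, side2, side3 = "A", None, "A"
merged = merge_path(base, side2, side3)
print(merged)  # None: P is deleted in the merge result
```

Only the three snapshots are visible to this rule, so a revert on one
side looks the same as never having touched the path.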

> [1] https://lore.kernel.org/git/CABPp-BE48=97k_3tnNqXPjSEfA163F8hoE+HY0Zvz1SWB2B8EA@xxxxxxxxxxxxxx/
> [2] https://lore.kernel.org/git/pull.988.v4.git.1626841444.gitgitgadget@xxxxxxxxx/

[3] https://www.theguardian.com/sport/2022/jul/24/chess-robot-grabs-and-breaks-finger-of-seven-year-old-opponent-moscow



