Re: [PATCH 03/19] Ensure index matches head before invoking merge machinery, round N

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Elijah,

On Thu, 25 Jul 2019, Elijah Newren wrote:

> This is the bug that just won't die; there always seems to be another
> form of it somewhere.  See the commit message of 55f39cf7551b ("merge:
> fix misleading pre-merge check documentation", 2018-06-30) for a more
> detailed explanation), but in short:
>
> <quick summary>
>
> builtin/merge.c contains this important requirement for merge
> strategies:
>
>     ...the index must be in sync with the head commit.  The strategies are
>     responsible to ensure this.
>
> This condition is important to enforce because there are two likely
> failure cases when the index isn't in sync with the head commit:
>
>   * we silently throw away changes the user had staged before the merge
>
>   * we accidentally (and silently) include changes in the merge that
>     were not part of either of the branches/trees being merged
>
> Discarding users' work and mis-merging are both bad outcomes, especially
> when done silently, so naturally this rule was stated sternly -- but,
> unfortunately totally ignored in practice unless and until actual bugs
> were found.  But, fear not: the bugs from this were fixed in commit
>   ee6566e8d70d ("[PATCH] Rewrite read-tree", 2005-09-05)
> through a rewrite of read-tree (again, commit 55f39cf7551b has a more
> detailed explanation of how this affected merge).  And it was fixed
> again in commit
>   160252f81626 ("git-merge-ours: make sure our index matches HEAD", 2005-11-03)
> ...and it was fixed again in commit
>   3ec62ad9ffba ("merge-octopus: abort if index does not match HEAD", 2016-04-09)
> ...and again in commit
>   65170c07d466 ("merge-recursive: avoid incorporating uncommitted changes in a merge", 2017-12-21)
> ...and again in commit
>   eddd1a411d93 ("merge-recursive: enforce rule that index matches head before merging", 2018-06-30)
>
> ...with multiple testcases added to the testsuite that could be
> enumerated in even more commits.
>
> Then, finally, in a patch in the same series as the last fix above, the
> documentation about this requirement was fixed in commit 55f39cf7551b
> ("merge: fix misleading pre-merge check documentation", 2018-06-30), and
> we all lived happily ever after...
>
> </quick summary>

Whoa. What a story.

> Unfortunately, "ever after" apparently denotes a limited time and it
> expired today.  The merge-recursive rule to enforce that index matches
> head was at the beginning of merge_trees() and would only trigger when
> opt->call_depth was 0.  Since merge_recursive() doesn't call
> merge_trees() until after returning from recursing, this meant that the
> check wasn't triggered by merge_recursive() until it had first finished
> all the intermediate merges to create virtual merge bases.  That is a
> potentially HUGE amount of computation (and writing of intermediate
> merge results into the .git/objects directory) before it errors out and
> says, in effect, "Sorry, I can't do any merging because you have some
> local changes that would be overwritten."
>
> Trying to enforce that all of merge_trees(), merge_recursive(), and
> merge_recursive_generic() checked the index == head condition earlier
> resulted in a bunch of broken tests.  It turns out that
> merge_recursive() has code to drop and reload the cache while recursing
> to create intermediate virtual merge bases, but unfortunately that code
> runs even when no recursion is necessary.  This unconditional dropping
> and reloading of the cache masked a few bugs:
>
>   * builtin/merge-recursive.c: didn't even bother loading the index.
>
>   * builtin/stash.c: feels like a fake 'builtin' because it repeatedly
>     invokes git subprocesses all over the place, mixed with other
>     operations.  In particular, invoking "git reset" will reset the
>     index on disk, but the parent process that invoked it won't
>     automatically have its in-memory index updated.

Yep, the idea was to move fast to a built-in, and then continue by
converting all those process spawns to proper calls to libgit "API"
functions.

Sadly, that did not happen yet.

And you're absolutely right that failing to re-read the index after
spawning a `reset --hard` causes problems. IIRC I fixed them all,
though, see
https://public-inbox.org/git/nycvar.QRO.7.76.6.1902191127420.41@xxxxxxxxxxxxxxxxx/

> So, load the index in builtin/merge-recursive.c, reload the in-memory
> index in builtin/stash.c, and modify the t3030 testcase to correctly
> setup the index and make sure that the test fails in the expected way
> (meaning it reports a rename/rename conflict).

Makes sense to me.

> diff --git a/builtin/merge-recursive.c b/builtin/merge-recursive.c
> index 5b910e351e..a4bfd8fc51 100644
> --- a/builtin/merge-recursive.c
> +++ b/builtin/merge-recursive.c
> @@ -1,3 +1,4 @@
> +#include "cache.h"
>  #include "builtin.h"
>  #include "commit.h"
>  #include "tag.h"
> @@ -63,6 +64,9 @@ int cmd_merge_recursive(int argc, const char **argv, const char *prefix)
>  	if (argc - i != 3) /* "--" "<head>" "<remote>" */
>  		die(_("not handling anything other than two heads merge."));
>
> +	if (repo_read_index_unmerged(the_repository))
> +		die_resolve_conflict("merge");

For a moment I was unsure whether `_unmerged()` is the right thing to do
here, as it specifically allows to read the index even when there are
conflict stages. But I guess it does not matter too much here. I
probably would have opted for `repo_read_index()` instead, though.

> +
>  	o.branch1 = argv[++i];
>  	o.branch2 = argv[++i];
>
> diff --git a/builtin/stash.c b/builtin/stash.c
> index fde6397caa..bec011c1bb 100644
> --- a/builtin/stash.c
> +++ b/builtin/stash.c
> @@ -427,6 +427,8 @@ static int do_apply_stash(const char *prefix, struct stash_info *info,
>  				return error(_("could not save index tree"));
>
>  			reset_head();
> +			discard_cache();
> +			read_cache();

I was honestly puzzled why this is necessary, at first. The preceding
context expands to this:

                        discard_cache();
                        read_cache();
                        if (write_cache_as_tree(&index_tree, 0, NULL))
                                return error(_("could not save index tree"));

So basically, we already discard the index, read it again, then write it
as a tree, then reset and then we have to discard the index again?

But of course, if there are uncommitted changes, this would write a tree
different from HEAD, then reset the index to match HEAD, so indeed, this
discard/read dance is necessary.

So this hunk is good.

>  		}
>  	}
>
> diff --git a/t/t3030-merge-recursive.sh b/t/t3030-merge-recursive.sh
> index ff641b348a..a37bcc58a0 100755
> --- a/t/t3030-merge-recursive.sh
> +++ b/t/t3030-merge-recursive.sh
> @@ -667,15 +667,22 @@ test_expect_success 'merging with triple rename across D/F conflict' '
>  test_expect_success 'merge-recursive remembers the names of all base trees' '
>  	git reset --hard HEAD &&
>
> +	# make the index match $c1 so that merge-recursive below does not
> +	# fail early
> +	git diff --binary HEAD $c1 -- | git apply --cached &&
> +
>  	# more trees than static slots used by oid_to_hex()
>  	for commit in $c0 $c2 $c4 $c5 $c6 $c7
>  	do
>  		git rev-parse "$commit^{tree}"
>  	done >trees &&
>
> -	# ignore the return code -- it only fails because the input is weird
> +	# ignore the return code; it only fails because the input is weird...
>  	test_must_fail git -c merge.verbosity=5 merge-recursive $(cat trees) -- $c1 $c3 >out &&
>
> +	# ...but make sure it fails in the expected way
> +	test_i18ngrep CONFLICT.*rename/rename out &&
> +

This is obviously a good change: it strengthens the test case by fixing
a subtle bug.

Thanks,
Dscho

>  	# merge-recursive prints in reverse order, but we do not care
>  	sort <trees >expect &&
>  	sed -n "s/^virtual //p" out | sort >actual &&
> --
> 2.22.0.559.g28a8880890.dirty
>
>




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux