Re: [PATCH v3 06/11] rerere: fix crash when conflict goes unresolved

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Thomas Gummerer <t.gummerer@xxxxxxxxx> writes:

> Currently when a user doesn't resolve a conflict in a file, but
> commits the file with the conflict markers, and later the file ends up
> in a state in which rerere can't handle it, subsequent rerere
> operations that are interested in that path, such as 'rerere clear' or
> 'rerere forget <path>' will fail, or even worse in the case of 'rerere
> clear' segfault.
>
> Such states include nested conflicts, or an extra conflict marker that
> doesn't have any match.
>
> This is because the first 'git rerere' when there was only one
> conflict in the file leaves an entry in the MERGE_RR file behind.  The

I find this sentence, especially the "only one conflict in the file"
part, a bit unclear.  What does the sentence count as one conflict?
One block of lines enclosed inside "<<<"...">>>" pair?  The command
behaves differently when there are two such blocks instead?

> next 'git rerere' will then pick the rerere ID for that file up, and
> not assign a new ID as it can't successfully calculate one.  It will
> however still try to do the rerere operation, because of the existing
> ID.  As the handle_file function fails, it will remove the 'preimage'
> for the ID in the process, while leaving the ID in the MERGE_RR file.
>
> Now when 'rerere clear' for example is run, it will segfault in
> 'has_rerere_resolution', because status is NULL.

I think this "status" refers to the collection->status[].  How do we
get into that state, though?

new_rerere_id() and new_rerere_id_hex() fills id->collection by
calling find_rerere_dir(), which either finds an existing rerere_dir
instance or manufactures one with .status==NULL.  The .status[]
array is later grown by calling fit_variant as we scan and find the
pre/post images, but because there is no pre/post image for a file
with unparseable conflicts, it is left NULL.

So another possible fix could be to make sure that .status[] is only
read when .status_nr says there is something worth reading.  I am
not saying that would be a better fix---I am just thinking out loud
to make sure I understand the issue correctly.

> To fix this, remove the rerere ID from the MERGE_RR file in the case
> when we can't handle it, and remove the corresponding variant from
> .git/rr-cache/.  Removing it unconditionally is fine here, because if
> the user would have resolved the conflict and ran rerere, the entry
> would no longer be in the MERGE_RR file, so we wouldn't have this
> problem in the first place, while if the conflict was not resolved,
> the only thing that's left in the folder is the 'preimage', which by
> itself will be regenerated by git if necessary, so the user won't
> loose any work.

s/loose/lose/

> Note that other variants that have the same conflict ID will not be
> touched.

Nice.  Thanks for a fix.

>
> Signed-off-by: Thomas Gummerer <t.gummerer@xxxxxxxxx>
> ---
>  rerere.c          | 12 +++++++-----
>  t/t4200-rerere.sh | 22 ++++++++++++++++++++++
>  2 files changed, 29 insertions(+), 5 deletions(-)
>
> diff --git a/rerere.c b/rerere.c
> index da1ab54027..895ad80c0c 100644
> --- a/rerere.c
> +++ b/rerere.c
> @@ -823,10 +823,7 @@ static int do_plain_rerere(struct string_list *rr, int fd)
>  		struct rerere_id *id;
>  		unsigned char sha1[20];
>  		const char *path = conflict.items[i].string;
> -		int ret;
> -
> -		if (string_list_has_string(rr, path))
> -			continue;
> +		int ret, has_string;
>  
>  		/*
>  		 * Ask handle_file() to scan and assign a
> @@ -834,7 +831,12 @@ static int do_plain_rerere(struct string_list *rr, int fd)
>  		 * yet.
>  		 */
>  		ret = handle_file(path, sha1, NULL);
> -		if (ret < 1)
> +		has_string = string_list_has_string(rr, path);
> +		if (ret < 0 && has_string) {
> +			remove_variant(string_list_lookup(rr, path)->util);
> +			string_list_remove(rr, path, 1);
> +		}
> +		if (ret < 1 || has_string)
>  			continue;

We used to say "if we know about the path we do not do anything
here, if we do not see any conflict in the file we do nothing,
otherwise we assign a new id"; we now say "see if we can parse
and also see if we have conflict(s); if we know about the path and
we cannot parse, drop it from the rr database (because otherwise the
entry will cause us trouble elsewhere later).  Otherwise, if we do
not have any conflict or we already know about the path, no need to
do anything. Otherwise, i.e. a newly discovered path with conflicts
gets a new id".

Makes sense.  "A known path with unparseable conflict gets dropped"
is the important change in this hunk.

> diff --git a/t/t4200-rerere.sh b/t/t4200-rerere.sh
> index 8417e5a4b1..34f0518a5e 100755
> --- a/t/t4200-rerere.sh
> +++ b/t/t4200-rerere.sh
> @@ -580,4 +580,26 @@ test_expect_success 'multiple identical conflicts' '
>  	count_pre_post 0 0
>  '
>  
> +test_expect_success 'rerere with unexpected conflict markers does not crash' '
> +	git reset --hard &&
> +
> +	git checkout -b branch-1 master &&
> +	echo "bar" >test &&
> +	git add test &&
> +	git commit -q -m two &&
> +
> +	git reset --hard &&
> +	git checkout -b branch-2 master &&
> +	echo "foo" >test &&
> +	git add test &&
> +	git commit -q -a -m one &&
> +
> +	test_must_fail git merge branch-1 &&
> +	sed "s/bar/>>>>>>> a/" >test.tmp <test &&
> +	mv test.tmp test &&

OK, so the "only one conflict" in the log message meant just one
side of the conflict marker.  More generally, the troublesome is
to have "conflict marker(s) that cannot be parsed" in the file.

> +	git rerere &&
> +
> +	git rerere clear
> +'
> +
>  test_done



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux