Re: Segfaults in git rebase --continue and git rerere

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Sep 09 2021, brian m. carlson wrote:

> [[PGP Signed Part:Undecided]]
> I'm having a bit of a problem with segfaults using
> 2.33.0.252.g9b09ab0cd71.  I was in the process of running "git rebase
> --continue" with that version to resolve some conflicts in a project I'm
> working on.  At that point, it segfaulted, and I got this (apologies for
> the French):
>
>   error: impossible d'analyser la section en conflit dans 'scutiger-lfs/src/bin/git-lfs-transfer.rs'
>   error: impossible d'analyser la section en conflit dans 'scutiger-lfs/src/bin/git-lfs-transfer.rs'
>   Pré-image enregistrée pour 'scutiger-lfs/src/bin/git-lfs-transfer.rs'
>   [HEAD détachée 5134185] scutiger-lfs: move BatchItem, Mode, and Oid to library
>    2 files changed, 74 insertions(+), 9 deletions(-)
>   Fusion automatique de scutiger-lfs/src/processor.rs
>   Fusion automatique de scutiger-lfs/src/bin/git-lfs-transfer.rs
>   CONFLIT (contenu) : Conflit de fusion dans scutiger-lfs/src/bin/git-lfs-transfer.rs
>   error: impossible d'appliquer 2cd1615... scutiger-lfs: move download and verify actions into backend
>   Resolve all conflicts manually, mark them as resolved with
>   "git add/rm <conflicted_files>", then run "git rebase --continue".
>   You can instead skip this commit: run "git rebase --skip".
>   To abort and get back to the state before "git rebase", run "git rebase --abort".
>   error: impossible d'analyser la section en conflit dans 'scutiger-lfs/src/bin/git-lfs-transfer.rs'
>   zsh: segmentation fault  git rebase --continue
>
> I can provide a tarball of the broken repo upon request.  It's 108 MB,
> so you'll need to provide some place for me to drop it.
>
> At that point, I needed to remove .git/MERGE_RR.lock (which is empty),
> and I ran "git rebase --abort" (because I realized my resolution was
> bad).  Upon which, I received a second segfault (traceback from tip of
> next):
>
>   #0  0x00000000005b6e6b in has_rerere_resolution (id=0x229b3a0) at rerere.c:162
>   162             return ((id->collection->status[variant] & both) == both);
>   
>   #0  0x00000000005b6e6b in has_rerere_resolution (id=0x229b3a0) at rerere.c:162
>           both = 3
>           variant = 0
>   #1  rerere_clear (r=0x722880 <the_repo>, merge_rr=merge_rr@entry=0x7ffe8c703810) at rerere.c:1239
>           id = 0x229b3a0
>           i = 0
>
> It appears to be because id->collection->status here is NULL.  It's
> unclear to me why that's the case, but it does appear to be linked to my
> .git/MERGE_RR file, which looks like this (xxd -g1):
>
>   00000000: 30 34 39 62 31 34 32 39 34 65 64 30 63 30 65 66  049b14294ed0c0ef
>   00000010: 62 65 39 32 66 34 66 64 33 31 31 63 37 63 34 30  be92f4fd311c7c40
>   00000020: 39 30 39 34 65 63 64 65 09 73 63 75 74 69 67 65  9094ecde.scutige
>   00000030: 72 2d 6c 66 73 2f 73 72 63 2f 62 69 6e 2f 67 69  r-lfs/src/bin/gi
>   00000040: 74 2d 6c 66 73 2d 74 72 61 6e 73 66 65 72 2e 72  t-lfs-transfer.r
>   00000050: 73 00                                            s.
>
> The corresponding directory under .git/rr-cache is empty.  This
> specifically seems to be the problem, and I've included a snippet of a
> test below that causes the same segfault.  My guess is that the original
> segfault left the MERGE_RR file present but the files in the rr-cache
> directory absent.
>
> Since rerere isn't a strong suit of mine, I'm not sending a patch, but I
> am including a failing test[0] which indicates the problem (and to which
> anyone is welcome to add my sign-off) in hopes that someone more
> familiar with this subsystem can figure out what's going on.

I haven't taken time to familiarize myself with it, but I have one
segfault fix for it already[1] which needs a test, I tried and this is
completely unrelated.

The "fix" for this segfault you're reporting could be, this makes your
test pass, along with the rest of the test suite:

diff --git a/rerere.c b/rerere.c
index d83d58df4fb..cbad6a89ebc 100644
--- a/rerere.c
+++ b/rerere.c
@@ -157,6 +157,9 @@ static int has_rerere_resolution(const struct rerere_id *id)
 	const int both = RR_HAS_POSTIMAGE|RR_HAS_PREIMAGE;
 	int variant = id->variant;
 
+	if (variant == id->collection->status_nr)
+		return 1;
+
 	if (variant < 0)
 		return 0;
 	return ((id->collection->status[variant] & both) == both);

But I have no idea if that's sensible. I.e. as you found the immediate
cause here is that our "id" is being looked up in
"id->collection->status" where that's NULL, there's similar checks
elsewhere for "id->collection->status_nr", but if that's correct here I
don't know, nor what the deeper logic error is of how the two went out
of sync.

1. https://lore.kernel.org/git/87v96p4w3f.fsf@xxxxxxxxxxxxxxxxxxx/

> ----- %< -----
> diff --git a/t/t4200-rerere.sh b/t/t4200-rerere.sh
> index 9f8c76dffb..c44dfe248a 100755
> --- a/t/t4200-rerere.sh
> +++ b/t/t4200-rerere.sh
> @@ -623,6 +623,7 @@ test_expect_success 'rerere with inner conflict markers' '
>  	git commit -q -m "will solve conflicts later" &&
>  	test_must_fail git merge A &&
>  
> +	cp .git/MERGE_RR merge_rr &&
>  	echo "resolved" >test &&
>  	git add test &&
>  	git commit -q -m "solved conflict" &&
> @@ -645,6 +646,13 @@ test_expect_success 'rerere with inner conflict markers' '
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'rerere clear does not segfault with bad data' '
> +	res_id=$($PERL_PATH -nF"\t" -e "print \$F[0]" merge_rr) &&
> +	cp merge_rr .git/MERGE_RR &&
> +	rm -f .git/rr-cache/$res_id/* &&
> +	git rerere clear
> +'
> +
>  test_expect_success 'setup simple stage 1 handling' '
>  	test_create_repo stage_1_handling &&
>  	(
> ----- %< -----
>
> As an aside, I generally find Git's codebase to be of exceptionally good
> quality for a C program, and so seeing two segfaults back to back led me
> to downgrade my recently upgraded version of glibc to see if somehow
> that was the problem.  Unfortunately, that was not the case, and we just
> have two separate bugs here.
>
> [0] The test uses perl because the MERGE_RR file contains a NUL byte and
> therefore cannot be used with standard POSIX utilities.

(Just a side-note, I think the use of perl here is fine)

I haven't tried, but I think you can probably do that res_id assignment
with "git grep" and -o, see t7816-grep-binary-pattern.sh for how we can
support greps with \0 in them with -f.




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux