Re: Broken branch after git commit - tracked files in staging area can't be removed with restore --staged, or commit or stash

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jan 08, 2020 at 11:02:41AM +0100, Torsten Krah wrote:

> Hi Jeff, I have a poc you can try:

Thanks, I can see the problem now.

> cd /tmp
> mkdir testrepo
> cd testrepo
> touch TEST1 TEST2
> git add -A
> git commit -m First
> touch TEST3 TEST4
> git add -A
> git commit -m Second
> git reset --soft HEAD~1

OK, so we'd expect to still see TEST3 and TEST4 in the index. And we do:

> git status
> 
>    Auf Branch master
>    Zum Commit vorgemerkte Änderungen:
>      (benutzen Sie "git restore --staged <Datei>..." zum Entfernen aus
>    der Staging-Area)
>    	neue Datei:     TEST3
>    	neue Datei:     TEST4

And then if we try to unstage one of them, we should see that:

> git restore --staged TEST3
> 
>    [10:57:26][tkrah@torstenknbl:/tmp/testrepo]  (master) $ LC_ALL=C git
>    status
>    On branch master
>    Changes to be committed:
>      (use "git restore --staged <file>..." to unstage)
>    	new file:   TEST4
> 
>    Untracked files:
>      (use "git add <file>..." to include in what will be committed)
>    	TEST3

So far so good. But I think there is a lurking problem in the index,
because...

> git commit -m Second
> 
>    [master 5b62331] Second
>     2 files changed, 0 insertions(+), 0
>    deletions(-)
>     create mode 100644 TEST3
>     create mode 100644 TEST4

This is very wrong (and is a bug, not a problem with what you're doing).
We wrote a commit with TEST3 in it, even though it wasn't in the index.
Why would we do that? Almost certainly it's because of a cache-tree
extension in the index. Presumably "git restore --staged" is not
properly invalidating the cache-tree entry. And that would explain what
you see next:

> And now TEST3 is in the commit and what is even more "interesting" is
> the next one:
> 
>    [10:59:16][tkrah@torstenknbl:/tmp/testrepo]  (master) $ LC_ALL=C git
> status
>    On branch master
>    Changes to be committed:
>      (use "git restore --staged <file>..." to unstage)
>    	deleted:    TEST3
> 
>    Untracked files:
>      (use "git add <file>..." to include in what will be committed)
>    	TEST3
> 
> TEST3 is unstaged and deleted now.

That happens because we wrote out a commit with TEST3, but it's still
not in the index! So it appears as if a deletion was staged (remember
that what is "staged" is really just the diff between the index and
HEAD).

We can repeat the same process, substituting "git reset -- TEST3" for
git-restore, and it works fine. So the problem is in git-restore itself.

Here's another observation. If we do:

  git restore --staged TEST3 TEST4

then running "git commit" won't do anything, since there's nothing
staged (and this is why I had trouble reproducing the problem earlier).
If we add a new file, like:

  echo whatever >new
  git add new

then the problem won't exhibit, because that will also invalidate the
cache-tree (because it has to account for "new"). But we could put those
files in their own subdirectory and do:

  $ git restore --staged subdir/TEST3
  $ git status
  On branch master
  Changes to be committed:
  	new file:   subdir/TEST4
  
  Untracked files:
  	subdir/TEST3

and if I commit that, I see the bug again:

  $ git commit -m second
  [master 5c3b8c1] second
   2 files changed, 0 insertions(+), 0 deletions(-)
   create mode 100644 subdir/TEST3
   create mode 100644 subdir/TEST4

Now here's even more fun. I wanted to do another test, so I tried to
reset back to the original "second" commit, but it segfaults! Yikes.

  $ git reset --hard HEAD@{2}
  Segmentation fault

Running it under valgrind shows we're indeed seeing a bogus cache-tree:

  ==2214443== Invalid read of size 4
  ==2214443==    at 0x362DFF: traverse_by_cache_tree (unpack-trees.c:734)
  ==2214443==    by 0x3630A3: traverse_trees_recursive (unpack-trees.c:799)
  ==2214443==    by 0x364209: unpack_callback (unpack-trees.c:1258)
  ==2214443==    by 0x35F925: traverse_trees (tree-walk.c:497)
  ==2214443==    by 0x364DDD: unpack_trees (unpack-trees.c:1589)
  ==2214443==    by 0x1BE6CF: reset_index (reset.c:95)
  ==2214443==    by 0x1BF894: cmd_reset (reset.c:421)
  ==2214443==    by 0x124868: run_builtin (git.c:444)
  ==2214443==    by 0x124BAE: handle_builtin (git.c:674)
  ==2214443==    by 0x124E24: run_argv (git.c:741)
  ==2214443==    by 0x1252AB: cmd_main (git.c:872)
  ==2214443==    by 0x1E811A: main (common-main.c:52)
  ==2214443==  Address 0x40 is not stack'd, malloc'd or (recently) free'd
  ==2214443== 
  ==2214443== Process terminating with default action of signal 11 (SIGSEGV)

This isn't the same spot that Emily fixed earlier today in [1], but it
seems like a very similar problem. Weird.

So instead I blew away the index and then did hard reset:

  $ rm .git/index
  $ git reset --hard
  HEAD is now at 5c3b8c1 second

And now we can see what happens if both files are unstaged, and then we
add a new file, and then commit:

  $ git reset --soft HEAD^
  $ git restore --staged subdir/TEST3 subdir/TEST4
  $ echo whatever >new
  $ git add new
  $ git commit -m again
  [master 119d9e4] again
   1 file changed, 1 insertion(+)
   create mode 100644 new

So that _doesn't_ exhibit the problem. I guess because adding the new
file causes git-add to reexamine the whole cache-tree and clean it up.

So there seem to be at least two bugs:

 - git-restore doesn't properly invalidate the cache-tree

 - the index-reading code is not careful enough about bogus cache-trees,
   and may segfault

-Peff

[1] https://lore.kernel.org/git/20200108023127.219429-1-emilyshaffer@xxxxxxxxxx/



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux