Re: Bug#569505: git-core: 'git add' corrupts repository if the working directory is modified as it runs

Jonathan Nieder <jrnieder@xxxxxxxxx> · Thu, 11 Feb 2010 18:27:41 -0600

Hi gitsters,

Zygo Blaxell reported through http://bugs.debian.org/569505 that ‘git
update-index’ has some issues when the files it is adding change under
its feet.

My thoughts:

 - Low-hanging fruit: it should be possible for update-index to check
   the stat information to see if the file has changed between when it
   first opens it and when it finishes.

 - Zygo reported that ‘git gc’ didn’t notice the problem.
   Should ‘git gc’ imply a ‘git fsck --no-full’?

 - Recovering from this kind of mistake in early history is indeed
   hard.  Any tricks for doing this?  Maybe fast-export | fast-import
   can do something with this, or maybe replace + filter-branch once
   it learns to be a little smarter.

 - How do checkout-index and cat-file blob react to a blob whose
   contents do not reflect its object name?  Are they behaving
   appropriately?  I would want cat-file blob to be able to retrieve
   such a broken blob’s contents, checkout-index not so much.
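
To make the first point concrete, here is a rough shell sketch of the
stat-before/stat-after idea.  It is only an illustration and not how
update-index is implemented: the function names stat_sig and
stable_hash are made up for this example, and it assumes GNU stat
(the -c option).  A stat comparison cannot catch every possible
change, so treat it as a cheap first check rather than a guarantee.

```shell
#!/bin/sh
# Sketch only: hash a file, but reject the result if the file's stat
# data changed while it was being read.  Assumes GNU stat (-c).

# Print inode, size, mtime and ctime of "$1" on one line.
stat_sig() {
	stat -c '%i %s %y %z' -- "$1"
}

# Print the blob name of "$1", or fail if the file changed under us.
stable_hash() {
	before=$(stat_sig "$1") || return 1
	name=$(git hash-object -- "$1") || return 1
	after=$(stat_sig "$1") || return 1
	if [ "$before" != "$after" ]; then
		echo "stable_hash: $1 changed while being hashed" >&2
		return 1
	fi
	printf '%s\n' "$name"
}
```

Calling "stable_hash foo" prints the blob name only when the two stat
snapshots agree; a caller could retry or give up otherwise.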

I imagine there are other things to learn, too.  The report and
reproduction recipe follow.

Thoughts?
Jonathan

Package: git-core
Version: 1:1.6.6.1-1
Severity: important

'git add' will happily corrupt a git repo if it is run while files in
the working directory are being modified.  A blob is added to the index
with contents that do not match its SHA1 hash.  If the index is then
committed, the corrupt blob cannot be checked out (or is checked out
with incorrect contents, depending on which tool you use to try to get
the file out of git) in the future.
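
One way to spot such a blob before committing is to re-hash what the
index points at and compare the result with the recorded object name.
The following is a minimal sketch under that assumption, not a
replacement for 'git fsck'; the function name check_index_blobs is
invented for this example, and paths with unusual characters are not
handled specially.

```shell
#!/bin/sh
# Sketch only: re-hash every blob recorded in the index and report any
# whose contents do not match the object name.

check_index_blobs() {
	git ls-files -s | {
		bad=0
		while read -r mode name stage path; do
			# Submodule entries (mode 160000) name commits, not blobs.
			[ "$mode" = 160000 ] && continue
			# If cat-file cannot read the object, the hash of its
			# (empty or partial) output will normally differ too.
			actual=$(git cat-file blob "$name" 2>/dev/null |
				git hash-object --stdin)
			if [ "$actual" != "$name" ]; then
				echo "mismatch: $path (index says $name)"
				bad=1
			fi
		done
		exit "$bad"
	}
}
```

Run from inside the work tree, check_index_blobs prints one line per
suspicious entry and returns non-zero if it found any.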

Surprisingly, it's possible to clone, fetch, push, pull, and sometimes
even gc the corrupted repo several times before anyone notices the
corruption.  If the affected commit is included in a merge with history
from other git users, the only way to fix it is to rebase (or come up
with a blob whose contents match the affected SHA1 hash somehow).

It is usually possible to retrieve data committed before the corruption
by simply checking out an earlier tree in the affected branch's history.

The following shell code demonstrates this problem.  It runs a thread
which continuously modifies a file, and another thread that does
'git commit -am' and 'git fsck' in a continuous loop until corruption
is detected.  This might take up to 20 seconds on a slow machine.

	#!/bin/sh
	set -e

	# Create an empty git repo in /tmp/git-test
	rm -fr /tmp/git-test
	mkdir /tmp/git-test
	cd /tmp/git-test
	git init

	# Create a file named foo and add it to the repo
	touch foo
	git add foo

	# Thread 1:  continuously modify foo:
	while printf .; do
		dd if=/dev/urandom of=foo count=1024 bs=1k conv=notrunc >/dev/null 2>&1
	done &

	# Thread 2:  loop until the repo is corrupted
	while git fsck; do
		# Note the implied 'git add' in 'commit -a'
		# It will do the same with explicit 'git add'
		git commit -a -m'Test'
	done

	# Kill thread 1, we don't need it any more
	kill $!

	# Success!  Well, sort of.
	echo Repository is corrupted.  Have a nice day.

I discovered this bug accidentally when I was using inotifywait (from
the inotify-tools package) to automatically commit snapshots of a working
directory triggered by write events.

I tested this with a number of kernel versions from 2.6.27 to 2.6.31.
All of them reproduced this problem.  I checked this because strace
shows 'git add' doing a mmap(..., MAP_PRIVATE, ...) of its input file,
so I was wondering if there might have been a recent change in mmap()
behavior in either git or the kernel.

git 1.5.6.5 has this problem too, but some of the error messages are
different, and the problem sometimes manifests itself as silent corruption
of other objects (e.g. if someone checks out a corrupt tree and then does
'git add -u' or 'git commit -a', they will include the corrupt data in
their commit).