On Sat, Feb 13, 2010 at 03:39:53PM +0100, Thomas Rast wrote:
> On Saturday 13 February 2010 14:39:52 Ilari Liusvaara wrote:
> > On Sat, Feb 13, 2010 at 06:12:38AM -0600, Jonathan Nieder wrote:
> > >
> > > With the current code, write_sha1_file() will hash the file, notice
> > > that the object is already in .git/objects, and return. With a
> > > read-hash-copy loop, git would have to store a (compressed or
> > > uncompressed) copy of the file somewhere in the meantime.
> >
> > It could be done by first reading the file and computing the hash;
> > if the hash matches an existing object, return that hash. Otherwise
> > read the file again for the object write, hashing it again and using
> > that value for the object ID.
>
> That is still racy. The real problem is that the file is mmap()ed,
> and git then first computes the SHA1 of that buffer, next it
> compresses it.[*]

Hmm... One needs to copy the data a block at a time into a temporary
buffer and feed that same buffer to both zlib and SHA-1. That ensures
that whatever SHA-1 hashes and whatever zlib compresses are consistent
with each other. Rough sketch below.

-Ilari
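
Something along these lines (untested sketch only; it uses OpenSSL's
SHA-1 and raw zlib rather than git's own wrappers, the function name
hash_and_deflate is made up, and it ignores git's "<type> <size>\0"
object header for brevity):

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <openssl/sha.h>
#include <zlib.h>

#define CHUNK 8192

/*
 * Read in_fd a block at a time; each block is read exactly once into
 * inbuf, and that same buffer is fed to both SHA-1 and zlib, so the
 * hash and the compressed stream always describe identical bytes even
 * if the file changes underneath us.
 */
static int hash_and_deflate(int in_fd, FILE *out, unsigned char sha1[20])
{
	unsigned char inbuf[CHUNK], outbuf[CHUNK];
	SHA_CTX sha;
	z_stream z;
	ssize_t n;

	memset(&z, 0, sizeof(z));
	SHA1_Init(&sha);
	if (deflateInit(&z, Z_DEFAULT_COMPRESSION) != Z_OK)
		return -1;

	while ((n = read(in_fd, inbuf, sizeof(inbuf))) > 0) {
		SHA1_Update(&sha, inbuf, n);	/* one read, two consumers */

		z.next_in = inbuf;
		z.avail_in = n;
		do {
			z.next_out = outbuf;
			z.avail_out = sizeof(outbuf);
			deflate(&z, Z_NO_FLUSH);
			fwrite(outbuf, 1, sizeof(outbuf) - z.avail_out, out);
		} while (z.avail_out == 0);
	}

	/* flush whatever zlib still has buffered */
	do {
		z.next_out = outbuf;
		z.avail_out = sizeof(outbuf);
		deflate(&z, Z_FINISH);
		fwrite(outbuf, 1, sizeof(outbuf) - z.avail_out, out);
	} while (z.avail_out == 0);

	deflateEnd(&z);
	SHA1_Final(sha1, &sha);
	return n < 0 ? -1 : 0;
}

With git's real object format the size has to go into the header that
gets hashed first, so you would still need to stat() the file up front
and bail out (or retry) if the number of bytes actually read disagrees
with it. But the hash and the deflated data can no longer disagree with
each other, which is the race Thomas pointed out.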