Re: 'git add' corrupts repository if the working directory is modified as it runs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Saturday 13 February 2010 14:39:52 Ilari Liusvaara wrote:
> On Sat, Feb 13, 2010 at 06:12:38AM -0600, Jonathan Nieder wrote:
> > 
> > With the current code, write_sha1_file() will hash the file, notice
> > that object is already in .git/objects, and return.  With a
> > read-hash-copy loop, git would have to store a (compressed or
> > uncompressed) copy of the file somewhere in the meantime.
> 
> It could be done by first reading the file and computing hash,
> if the hash matches existing object, return that hash. Otherwise
> read the file for object write, hashing it again and use that value
> for object ID.

That is still racy.  The real problem is that the file is mmap()ed,
and git then first computes the SHA1 of that buffer, next it
compresses it.[*]

Due to the last sentence in the following snippet from mmap(2):

  MAP_PRIVATE
           Create a private copy-on-write mapping.  Updates to the map-
           ping are not visible to other  processes  mapping  the  same
           file,  and  are  not carried through to the underlying file.
           It is unspecified whether changes made to the file after the
           mmap() call are visible in the mapped region.

This is racy despite the use of MAP_PRIVATE: the mapped contents can
change at any time.

AFAICS there are only two possible solutions:

* Copy the file (possibly block-by-block) as we go, to make sure that
  the data we SHA1 is the same we compress.

* Unpack and re-hash the compressed data to verify that the SHA1 is
  correct.  In case of failure either retry (but you could have to do
  this infinitely often if the user just hates you!) or abort.

(Of course, in neither case does the user have any sort of guarantee
about what data ended up in the repository, but he never had that, we
only try to ensure repo consistency.)


[*] The "do we have this" check actually happens before the
compression, and that arm is thus race-free.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]