Re: [PATCH] Teach "git add" and friends to be paranoid

On Thu, 18 Feb 2010, Junio C Hamano wrote:

> Nicolas Pitre <nico@xxxxxxxxxxx> writes:
> 
> >> Honesty is very good.  An alternative implementation that does not hurt
> >> performance as much as the "paranoia" would, and checks "the input well
> >> enough" would be very welcome.
> >
> > Can't we rely on the mtime of the source file?  Sample it before 
> > starting hashing it, then make sure it didn't change when done.
> 
> I suspect that opening to mmap(2), hashing once to compute the object
> name, and deflating it to write it out, will all happen within the same
> second, unless you are talking about a really huge file, or you started at
> very near a second boundary.

How does the index deal with this?  Surely if a file is added to the 
index and then modified within the same second, 'git status' will fail 
to notice the change.  I'm not familiar enough with that part of Git.

Alternatively, you could use the initial mtime sample to estimate the 
filesystem's time granularity by noticing how many LSBs are zero.  
Let's say FAT has a granularity of one second.  Then if the mtime of 
the file is less than one second in the past before starting to hash, 
just wait for one second.  If, one second later, the mtime has changed 
and is still less than a second in the past, then abort.  If the mtime 
has changed after the hash, then abort.
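
To make the steps above concrete, here is a rough sketch in Python (not 
Git's actual code, and not a proposed patch).  The function names are 
made up for illustration.  It also assumes "how many LSBs are zero" can 
be approximated by the largest power of ten that divides the nanosecond 
part of the mtime, capped at one second, and it uses SHA-1 over the raw 
file contents as a stand-in for the real object hashing.

```python
import hashlib
import os
import time

NS_PER_SEC = 1_000_000_000


def guess_granularity_ns(mtime_ns):
    """Guess the timestamp granularity from one mtime sample.

    Assumption: the granularity is the largest power of ten dividing
    the sub-second part, capped at one second.  This is only a guess
    and can overestimate when the sample lands on a round value.
    """
    nsec = mtime_ns % NS_PER_SEC
    if nsec == 0:
        return NS_PER_SEC
    gran = 1
    while nsec % (gran * 10) == 0:
        gran *= 10
    return gran


def hash_file_checked(path, now_ns=time.time_ns, sleep=time.sleep):
    """Hash a file, aborting if its mtime suggests a concurrent change."""
    before = os.stat(path).st_mtime_ns
    gran = guess_granularity_ns(before)

    # The mtime is too recent to tell a later write apart: wait one
    # granularity unit, then sample again.
    if now_ns() - before < gran:
        sleep(gran / NS_PER_SEC)
        after_wait = os.stat(path).st_mtime_ns
        if after_wait != before and now_ns() - after_wait < gran:
            raise RuntimeError("file is still being modified")
        before = after_wait

    with open(path, "rb") as f:
        digest = hashlib.sha1(f.read()).hexdigest()

    # Any mtime change during hashing means the content may be torn.
    if os.stat(path).st_mtime_ns != before:
        raise RuntimeError("file changed while hashing")
    return digest
```

The clock and sleep functions are passed in as parameters only so the 
waiting path can be exercised without real delays.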

On a recent filesystem, the mtime granularity is likely to be a 
nanosecond.  The above algorithm should nevertheless work all the same, 
although it is unlikely that the mtime will be within the current 
nanosecond, so the probability of having to do an initial wait is 
almost zero.  On kernels without hires timers the granularity will be 
more like 10 ms.

Of course you might be unlucky and the initial mtime sample happens to 
be right on a whole second even on a filesystem with high-resolution 
mtimes, in which case the delay test will assume one second instead of 
10 ms or whatever.  But the probability of all the sub-second bits 
being zero, causing a longer delay than actually necessary, is rather 
small, and this would matter only for files that were modified within 
that second.  I don't think there is a reliable way to query the 
timestamp granularity of a filesystem+OS combination.


Nicolas