Re: [PATCH] sha1_file: don't malloc the whole compressed result when writing out objects

Junio C Hamano <gitster@xxxxxxxxx> writes:

> Nicolas Pitre <nico@xxxxxxxxxxx> writes:
>
>> And what real life case would trigger this?  Given the size of the 
>> window for this to happen, what are your chances?
>
>> Of course the odds for me to be struck by lightning also exist.  And if 
>> I work really really hard at it then I might be able to trigger that 
>> pathological case above even before the next thunderstorm.  But in 
>> practice I'm hardly concerned by either of those possibilities.
>
> For me, the number of real-life cases that trigger any of this is zero, as
> I won't be mistreating git as a continuous & asynchronous back-up tool.
>
> But then that would make the whole discussion moot.  There are people who
> file "bug reports" with an artificial reproduction recipe built around a
> loop that runs dd continuously overwriting a file while "git add" is asked
> to add it.

Having said all that, I like your approach better.  It is not worth paying
the price of an unnecessary memcpy(3) that would _only_ help catch the
insanely artificial test case, whereas your patch strikes a good balance:
a small overhead to catch the easier-to-trigger cases (whether caused by
stupidity, malice or mistake).
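
For readers who want to see the shape of the check, here is a minimal
sketch in Python.  It is not the C code in sha1_file.c.  It assumes the idea
is "feed a chunk to deflate, then hash the same memory again, and compare
the final digest with the SHA1 computed earlier".  The function name
deflate_and_verify and the two hooks (before_deflate, before_hash) are
hypothetical; the hooks exist only to simulate a concurrent writer.  The
object header that git hashes along with the data is left out.

```python
import hashlib
import zlib


def deflate_and_verify(buf, expected_sha1, chunk_size=4096,
                       before_deflate=None, before_hash=None):
    """Deflate buf in chunks, hashing each chunk after deflate consumed it.

    Returns the compressed bytes, or raises ValueError when the bytes read
    back from buf no longer hash to expected_sha1.
    """
    view = memoryview(buf)            # no copy: both readers see live memory
    compressor = zlib.compressobj()
    check = hashlib.sha1()
    out = []
    for start in range(0, len(view), chunk_size):
        chunk = view[start:start + chunk_size]
        if before_deflate:
            before_deflate(buf)       # hypothetical hook: writer strikes here
        out.append(compressor.compress(chunk))
        if before_hash:
            before_hash(buf)          # hypothetical hook: writer strikes again
        check.update(chunk)           # re-read what deflate just consumed
    out.append(compressor.flush())
    if check.hexdigest() != expected_sha1:
        raise ValueError("source data changed while it was being written")
    return b"".join(out)


if __name__ == "__main__":
    data = bytearray(b"hello world\n" * 1000)
    expected = hashlib.sha1(data).hexdigest()
    packed = deflate_and_verify(data, expected)
    print(len(data), "bytes deflated to", len(packed))
```

A buffer that is changed after the expected SHA1 was computed and then left
alone makes the sketch raise ValueError.  A buffer that is changed before
the compress call and restored before the update call passes the check even
though the compressed stream no longer matches the SHA1, which is the
window Nico considers too small to worry about.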

So I am tempted to discard the "paranoia" patch and replace it with your two
patches, with the following caveat added to the log message.

--- /var/tmp/2	2010-02-21 22:23:30.000000000 -0800
+++ /var/tmp/1	2010-02-21 22:23:22.000000000 -0800
@@ -21,7 +21,9 @@
     deflate operation has consumed that data, and make sure it matches
     with the expected SHA1.  This way we can rely on the CRC32 checked by
     the inflate operation to provide a good indication that the data is still
-    coherent with its SHA1 hash.
+    coherent with its SHA1 hash.  One pathological case we ignore is when
+    the data is modified before (or during) deflate call, but changed back
+    before it is hashed.
     
     There is some overhead of course. Using 'git add' on a set of large files:
     