Re: git fsck failure on OS X with files >= 4 GiB

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Thank you for the feedback.  I have revised the proposed patch as
suggested, allowing the use of SHA1_MAX_BLOCK_SIZE to enable the
chunked implementation.  When building for OSX with the CommonCrypto
library we error out if SHA1_MAX_BLOCK_SIZE is not defined, which will
avoid compiling a version of the tool that won't compute hashes
properly on large files.  It should be easy to enable this on other
platforms if needed.

Atousa

On Thu, Oct 29, 2015 at 10:19 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Atousa Duprat <atousa.p@xxxxxxxxx> writes:
>
>> [PATCH] Limit the size of the data block passed to SHA1_Update()
>>
>> This avoids issues where OS-specific implementations use
>> a 32-bit integer to specify block size.  Limit currently
>> set to 1GiB.
>> ---
>>  cache.h | 20 +++++++++++++++++++-
>>  1 file changed, 19 insertions(+), 1 deletion(-)
>>
>> diff --git a/cache.h b/cache.h
>> index 79066e5..c305985 100644
>> --- a/cache.h
>> +++ b/cache.h
>> @@ -14,10 +14,28 @@
>>  #ifndef git_SHA_CTX
>>  #define git_SHA_CTX SHA_CTX
>>  #define git_SHA1_Init SHA1_Init
>> -#define git_SHA1_Update SHA1_Update
>>  #define git_SHA1_Final SHA1_Final
>>  #endif
>>
>> +#define SHA1_MAX_BLOCK_SIZE (1024*1024*1024)
>> +
>> +static inline int git_SHA1_Update(SHA_CTX *c, const void *data, size_t len)
>> +{
>> + size_t nr;
>> + size_t total = 0;
>> + char *cdata = (char*)data;
>> + while(len > 0) {
>> + nr = len;
>> + if(nr > SHA1_MAX_BLOCK_SIZE)
>> + nr = SHA1_MAX_BLOCK_SIZE;
>> + SHA1_Update(c, cdata, nr);
>> + total += nr;
>> + cdata += nr;
>> + len -= nr;
>> + }
>> + return total;
>> +}
>> +
>
> I think the idea illustrated above is a good start, but there are
> a few issues:
>
>  * SHA1_Update() is used in fairly many places; it is unclear if it
>    is a good idea to inline.
>
>  * There is no need to punish implementations with working
>    SHA1_Update by another level of wrapping.
>
>  * What would you do when you find an implementation for which 1G is
>    still too big?
>
> Perhaps something like this in the header
>
> #ifdef SHA1_MAX_BLOCK_SIZE
> extern int SHA1_Update_Chunked(SHA_CTX *, const void *, size_t);
> #define git_SHA1_Update SHA1_Update_Chunked
> #endif
>
> with compat/sha1_chunked.c that has
>
> #ifdef SHA1_MAX_BLOCK_SIZE
> int SHA1_Update_Chunked(SHA_CTX *c, const void *data, size_t len)
> {
>         ... your looping implementation ...
> }
> #endif
>
> in it, that is only triggered via a Makefile macro, e.g.
> might be a good workaround.
>
> diff --git a/Makefile b/Makefile
> index 8466333..83348b8 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -139,6 +139,10 @@ all::
>  # Define PPC_SHA1 environment variable when running make to make use of
>  # a bundled SHA1 routine optimized for PowerPC.
>  #
> +# Define SHA1_MAX_BLOCK_SIZE if your SSH1_Update() implementation can
> +# hash only a limited amount of data in one call (e.g. APPLE_COMMON_CRYPTO
> +# may want 'SHA1_MAX_BLOCK_SIZE=1024L*1024L*1024L' defined).
> +#
>  # Define NEEDS_CRYPTO_WITH_SSL if you need -lcrypto when using -lssl (Darwin).
>  #
>  # Define NEEDS_SSL_WITH_CRYPTO if you need -lssl when using -lcrypto (Darwin).
> @@ -1002,6 +1006,7 @@ ifeq ($(uname_S),Darwin)
>         ifndef NO_APPLE_COMMON_CRYPTO
>                 APPLE_COMMON_CRYPTO = YesPlease
>                 COMPAT_CFLAGS += -DAPPLE_COMMON_CRYPTO
> +               SHA1_MAX_BLOCK_SIZE=1024L*1024L*1024L
>         endif
>         NO_REGEX = YesPlease
>         PTHREAD_LIBS =
> @@ -1350,6 +1355,11 @@ endif
>  endif
>  endif
>
> +ifdef SHA1_MAX_BLOCK_SIZE
> +LIB_OBJS += compat/sha1_chunked.o
> +BASIC_CFLAGS += SHA1_MAX_BLOCK_SIZE="$(SHA1_MAX_BLOCK_SIZE)"
> +endif
> +
>  ifdef NO_PERL_MAKEMAKER
>         export NO_PERL_MAKEMAKER
>  endif



-- 
Atousa Pahlevan, PhD
M.Math. University of Waterloo, Canada
Ph.D. Department of Computer Science, University of Victoria, Canada
Voice: 415-341-6206
Email: apahlevan@xxxxxxxx
Website: www.apahlevan.org
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]