Re: [PATCH 2/2] index-pack: reduce memory usage when the pack has large blobs

On Fri, Feb 24, 2012 at 07:23:21PM +0700, Nguyễn Thái Ngọc Duy wrote:
> This command unpacks every non-delta object in order to:
> 
> 1. calculate sha-1
> 2. do byte-to-byte sha-1 collision test if we happen to have objects
>    with the same sha-1
> 3. validate object content in strict mode
> 
> All this requires the entire object to stay in memory, which is bad news for
> giant blobs. This patch lowers memory consumption by not saving the
> object in memory whenever possible, calculating SHA-1 while unpacking
> the object.
> 
> This patch assumes that the collision test is rarely needed. The
> collision test will be done later in a second pass if necessary, which
> puts the entire object back in memory again (we could even do the
> collision test without putting the entire object back in memory, by
> comparing as we unpack it).
> 
> In strict mode, it always keeps non-blob objects in memory for
> validation (blobs do not need data validation). "--strict --verify"
> also keeps blobs in memory.
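
For reference, the idea in the description above (hash the object while it
is being unpacked instead of holding all of it in memory) can be sketched
roughly as below. This is only an illustration under assumptions, not the
patch's code: struct sha1_ctx, sha1_init/sha1_update/sha1_final and
hash_object_streaming are hypothetical names, the tiny SHA-1 stands in for
git's git_SHA1_* wrappers, and the chunks are cut from a plain buffer where
the real code would take them from zlib as it inflates.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Tiny streaming SHA-1, illustration only; stands in for git_SHA1_*. */
struct sha1_ctx {
    uint32_t h[5];
    uint64_t total;        /* bytes fed so far */
    unsigned char buf[64]; /* partial block waiting for more data */
    size_t used;           /* bytes currently held in buf */
};

static uint32_t rol(uint32_t x, int n) { return (x << n) | (x >> (32 - n)); }

static void sha1_block(struct sha1_ctx *c, const unsigned char *p)
{
    uint32_t w[80], a, b, cc, d, e, f, k, t;
    int i;

    for (i = 0; i < 16; i++)
        w[i] = (uint32_t)p[4 * i] << 24 | (uint32_t)p[4 * i + 1] << 16 |
               (uint32_t)p[4 * i + 2] << 8 | (uint32_t)p[4 * i + 3];
    for (; i < 80; i++)
        w[i] = rol(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);
    a = c->h[0]; b = c->h[1]; cc = c->h[2]; d = c->h[3]; e = c->h[4];
    for (i = 0; i < 80; i++) {
        if (i < 20)      { f = (b & cc) | (~b & d);           k = 0x5A827999; }
        else if (i < 40) { f = b ^ cc ^ d;                    k = 0x6ED9EBA1; }
        else if (i < 60) { f = (b & cc) | (b & d) | (cc & d); k = 0x8F1BBCDC; }
        else             { f = b ^ cc ^ d;                    k = 0xCA62C1D6; }
        t = rol(a, 5) + f + e + k + w[i];
        e = d; d = cc; cc = rol(b, 30); b = a; a = t;
    }
    c->h[0] += a; c->h[1] += b; c->h[2] += cc; c->h[3] += d; c->h[4] += e;
}

static void sha1_init(struct sha1_ctx *c)
{
    c->h[0] = 0x67452301; c->h[1] = 0xEFCDAB89; c->h[2] = 0x98BADCFE;
    c->h[3] = 0x10325476; c->h[4] = 0xC3D2E1F0;
    c->total = 0;
    c->used = 0;
}

/* Feed any amount of data; only a 64-byte block is ever buffered. */
static void sha1_update(struct sha1_ctx *c, const void *data, size_t len)
{
    const unsigned char *p = data;

    c->total += len;
    while (len > 0) {
        size_t n = 64 - c->used;
        if (n > len)
            n = len;
        memcpy(c->buf + c->used, p, n);
        c->used += n; p += n; len -= n;
        if (c->used == 64) { sha1_block(c, c->buf); c->used = 0; }
    }
}

static void sha1_final(unsigned char out[20], struct sha1_ctx *c)
{
    uint64_t bits = c->total * 8; /* message length, taken before padding */
    unsigned char pad = 0x80, zero = 0, lenbuf[8];
    int i;

    sha1_update(c, &pad, 1);
    while (c->used != 56)
        sha1_update(c, &zero, 1);
    for (i = 0; i < 8; i++)
        lenbuf[i] = (unsigned char)(bits >> (56 - 8 * i));
    sha1_update(c, lenbuf, 8);
    for (i = 0; i < 20; i++)
        out[i] = (unsigned char)(c->h[i / 4] >> (24 - 8 * (i % 4)));
}

/* Hash "<type> <len>\0" and then the content, one chunk at a time. */
static void hash_object_streaming(const char *type, const unsigned char *data,
                                  unsigned long len, size_t chunk,
                                  unsigned char out[20])
{
    struct sha1_ctx c;
    char hdr[32];
    int hdrlen = snprintf(hdr, sizeof(hdr), "%s %lu", type, len) + 1;
    unsigned long off;

    assert(chunk > 0);
    assert(hdrlen > 0 && (size_t)hdrlen <= sizeof(hdr));
    sha1_init(&c);
    sha1_update(&c, hdr, (size_t)hdrlen);
    for (off = 0; off < len; off += chunk) {
        size_t n = len - off < chunk ? (size_t)(len - off) : chunk;
        sha1_update(&c, data + off, n); /* this chunk can be dropped afterwards */
    }
    sha1_final(out, &c);
}
```

The digest does not depend on the chunk size, so the unpacking loop is free
to use whatever buffer it already has. For type "blob" and a zero-length
object this produces e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, the usual
empty-blob id.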

I applied both patches to git master with some manual tinkering, so I
might have missed a change, and that could be what causes this to break.

But I get a segmentation fault, and I thought I'd send you a
short trace before I start looking into it:
0xb7eb5b43 in SHA1_Update () from /lib/i686/cmov/libcrypto.so.0.9.8
(gdb) bt
#0  0xb7eb5b43 in SHA1_Update () from /lib/i686/cmov/libcrypto.so.0.9.8
#1  0x08116a2d in write_sha1_file_prepare
#2  0x08116a83 in hash_sha1_file
#3  0x0807c2a6 in sha1_object 
#4  0x0807d74a in parse_pack_objects
#5  0x0807de6f in cmd_index_pack 
#6  0x0804be97 in run_builtin 
#7  handle_internal_command 
#8  0x0804c0ad in run_argv 
#9  main

Sorry about the censorship, but I don't know how sensitive this data
is...


sha1_file.c:2343
---
static void write_sha1_file_prepare(const void *buf, unsigned long len,
                                    const char *type, unsigned char *sha1,
                                    char *hdr, int *hdrlen)
{
        git_SHA_CTX c;

        /* Generate the header */
        *hdrlen = sprintf(hdr, "%s %lu", type, len)+1;

        /* Sha1.. */
        git_SHA1_Init(&c);
        git_SHA1_Update(&c, hdr, *hdrlen);
        git_SHA1_Update(&c, buf, len); /* <== this line fails */
        git_SHA1_Final(sha1, &c);
}
---
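
The "+1" on the header line is easy to misread, so here is a small
standalone sketch of just that step. make_object_header is a hypothetical
helper (not a git function), and it uses snprintf with an explicit buffer
size, which the snippet above does not.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build "<type> <len>" in hdr and return the number of bytes to hash,
 * including the terminating NUL. Returns -1 if the header does not fit.
 */
static int make_object_header(char *hdr, size_t hdrsize,
                              const char *type, unsigned long len)
{
    int n = snprintf(hdr, hdrsize, "%s %lu", type, len);

    if (n < 0 || (size_t)n >= hdrsize)
        return -1;
    return n + 1; /* the NUL is part of what gets hashed */
}
```

For type "blob" and len 5 this gives the 7 bytes "blob 5\0", which are
hashed before the object content.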

Just keep sending patches; I have at least one git repository to test them on. ;)
