Re: large files and low memory

On Mon, Oct 4, 2010 at 11:24 AM, Joshua Jensen
<jjensen@xxxxxxxxxxxxxxxxx> wrote:
>> On Mon, Oct 4, 2010 at 2:20 AM, Enrico Weigelt <weigelt@xxxxxxxx> wrote:
>>>
>>> when adding files that are larger than available physical memory,
>>> git performs very slowly.
>>
>> The mmap() isn't the problem.  It's the allocation of a buffer
>> larger than the file, used to hold the result of deflating the file
>> before it is written to disk.
...
>> This is a known area in Git where big files aren't handled well.
>
> As a curiosity, I've always done streaming decompression with zlib using
> minimal buffer sizes (64k, perhaps).  I'm sure there is good reason why Git
> doesn't do this (delta application?).  Do you know what it is?
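
(For reference, streaming inflate over a small fixed buffer is just the
standard zlib loop; the sketch below is illustrative only, with an
assumed 64k chunk size and simplified error handling, and it is not
code from git or from this thread.)

#include <stdio.h>
#include <string.h>
#include <zlib.h>

#define CHUNK (64 * 1024)

/* Inflate 'in' to 'out' using fixed CHUNK-sized buffers. */
static int inflate_stream(FILE *in, FILE *out)
{
	unsigned char inbuf[CHUNK], outbuf[CHUNK];
	z_stream zs;
	int ret = Z_OK;

	memset(&zs, 0, sizeof(zs));
	if (inflateInit(&zs) != Z_OK)
		return -1;

	do {
		/* refill the input window; never more than 64k at a time */
		zs.avail_in = fread(inbuf, 1, CHUNK, in);
		if (ferror(in) || !zs.avail_in)
			break;
		zs.next_in = inbuf;

		/* drain everything zlib can produce from this chunk */
		do {
			zs.avail_out = CHUNK;
			zs.next_out = outbuf;
			ret = inflate(&zs, Z_NO_FLUSH);
			if (ret == Z_STREAM_ERROR || ret == Z_NEED_DICT ||
			    ret == Z_DATA_ERROR || ret == Z_MEM_ERROR)
				goto done;
			fwrite(outbuf, 1, CHUNK - zs.avail_out, out);
		} while (zs.avail_out == 0);
	} while (ret != Z_STREAM_END);
done:
	inflateEnd(&zs);
	return ret == Z_STREAM_END ? 0 : -1;
}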

Laziness.  Git originally assumed it would only be used for small
source files written by humans.  It's easier to write the code around
a single malloc'd buffer than to stream the data.  We'd like to fix
it, but it's harder than it sounds.  Today we copy the whole file into
a buffer before we deflate it and compute the SHA-1, because that
prevents a consistency error if the file is modified between those
two steps.
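
A streaming fix would have to hash and deflate the same chunks in one
pass, roughly like the sketch below.  This is only a sketch, not git
code: it assumes zlib plus OpenSSL's SHA1_* interface rather than
git's internal APIs, the helper names are made up, and error handling
is simplified.  The point it illustrates is that the "blob <size>\0"
header is hashed before any content is read, so if the file changes
size while we read it, the object name no longer matches the bytes
that were actually stored.

#include <stdio.h>
#include <string.h>
#include <openssl/sha.h>
#include <zlib.h>

#define CHUNK (64 * 1024)

/* Push one buffer through deflate, writing compressed bytes to 'out'. */
static void deflate_chunk(z_stream *zs, FILE *out, const void *buf,
			  size_t len, int flush)
{
	unsigned char outbuf[CHUNK];

	zs->next_in = (unsigned char *)buf;
	zs->avail_in = len;
	do {
		zs->avail_out = CHUNK;
		zs->next_out = outbuf;
		deflate(zs, flush);
		fwrite(outbuf, 1, CHUNK - zs->avail_out, out);
	} while (zs->avail_out == 0);
}

/* Hash and deflate 'size' bytes of 'in' in one pass, 64k at a time. */
static int hash_and_deflate(FILE *in, unsigned long size, FILE *out,
			    unsigned char sha1[20])
{
	unsigned char buf[CHUNK];
	char hdr[32];
	int hdrlen;
	unsigned long remaining = size;
	SHA_CTX c;
	z_stream zs;

	/*
	 * The object header is hashed and deflated first, using the
	 * size we got from stat().  This is exactly where the
	 * consistency problem lives: if the file grows or shrinks
	 * while we read it below, the header no longer matches the
	 * bytes we stored.
	 */
	hdrlen = snprintf(hdr, sizeof(hdr), "blob %lu", size) + 1;
	SHA1_Init(&c);
	SHA1_Update(&c, hdr, hdrlen);

	memset(&zs, 0, sizeof(zs));
	if (deflateInit(&zs, Z_DEFAULT_COMPRESSION) != Z_OK)
		return -1;
	deflate_chunk(&zs, out, hdr, hdrlen, Z_NO_FLUSH);

	/* stream the content; the working set never exceeds one chunk */
	while (remaining) {
		size_t want = remaining < CHUNK ? remaining : CHUNK;
		size_t n = fread(buf, 1, want, in);
		if (!n)
			break;	/* short read: the file shrank under us */
		SHA1_Update(&c, buf, n);
		deflate_chunk(&zs, out, buf, n, Z_NO_FLUSH);
		remaining -= n;
	}
	deflate_chunk(&zs, out, NULL, 0, Z_FINISH);
	deflateEnd(&zs);
	SHA1_Final(sha1, &c);

	return remaining ? -1 : 0;	/* size changed: inconsistent object */
}

Note also that the compressed stream has to go to a temporary file
first, since the final object name isn't known until the hash is
complete.  Detecting and handling the size-changed case cleanly in
every code path is the part that is harder than it sounds.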

-- 
Shawn.