Re: malloc fails when dealing with huge files

i tried to do something like that over a year ago, having gotten the
insane idea that i wanted to version my whole hard drive.  binaries
were a huge problem.

clones were also a problem over slow connections because there is
no git-clone --resume, so if your connection is interrupted, you're
back at square one.  perhaps git-torrent will fix that.

git was designed to track source code line by line more than to track
arbitrary files.  let me know if you find a better alternative to git
for filesystems.

it's too bad there's no better way to tie large resources to a
version by sha1 while storing them separately from the source.

On Wed, Dec 10, 2008 at 11:32 AM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
>
> On Wed, 10 Dec 2008, Jonathan Blanton wrote:
>>
>> I'm using Git for a project that contains huge (multi-gigabyte) files.
>>  I need to track these files, but with some of the really big ones,
>> git-add aborts with the message "fatal: Out of memory, malloc failed".
>
> git is _really_ not designed for huge files.
>
> By design - good or bad - git does pretty much all single file operations
> with the whole file in memory as one single allocation.
>
> Now, some of that is hard to fix - or at least would generate much more
> complex code. The _particular_ case of "git add" could be fixed without
> undue pain, but it's not entirely trivial either.
>
> The main offender is probably "index_fd()" that just mmap's the whole file
> in one go and then calls write_sha1_file() which really expects it to be
> one single memory area both for the initial SHA1 create and for the
> compression and writing out of the result.
>
> Changing that to do big files in pieces would not be _too_ painful, but
> it's not just a couple of lines either.
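
for what it's worth, here is a rough python sketch of what "big files in
pieces" could look like: hash and compress the file one chunk at a time
instead of mapping it all at once.  this is not git's code.  the function
name, the chunk size and the output path are made up for illustration.  the
only git-specific fact it relies on is that a blob's sha1 covers the header
"blob <size>\0" followed by the file contents.

```python
import hashlib
import os
import zlib

CHUNK_SIZE = 1 << 20  # 1 MiB per read; an arbitrary choice for this sketch


def write_blob_in_chunks(src_path, dst_path, chunk_size=CHUNK_SIZE):
    """Hash and zlib-compress src_path as a git blob, one chunk at a time.

    Hypothetical helper, not part of git. Returns the hex SHA-1 git would
    assign to the blob; dst_path receives the compressed object bytes.
    Assumes the file does not change size while it is being read.
    """
    size = os.path.getsize(src_path)
    # git hashes the header "blob <size>\0" followed by the raw content.
    header = b"blob %d\0" % size

    sha = hashlib.sha1(header)
    deflater = zlib.compressobj()

    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        dst.write(deflater.compress(header))
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            # Only one chunk is held in memory at any time.
            sha.update(chunk)
            dst.write(deflater.compress(chunk))
        dst.write(deflater.flush())

    return sha.hexdigest()
```

memory use stays around one chunk no matter how large the file is.  the
catch is that the object name is only known after the last chunk, so a real
implementation would presumably have to write to a temporary file and rename
it once the hash is final.
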
>
> However, git performance with big files would never be wonderful, and
> things like "git diff" would still end up reading not just the whole file,
> but _both_versions_ at the same time. Marking the big files as being
> no-diff might help, though.
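
the no-diff marking can be done with a .gitattributes file.  a minimal
sketch, assuming the big files can be picked out by extension (the patterns
below are placeholders, not taken from the original project):

```gitattributes
# unset the diff attribute so git treats these paths as binary
# and does not try to produce a textual diff for them
*.iso -diff
*.img -diff
```

with the attribute unset, "git diff" should just report that the binary
files differ rather than generating a patch.  it does nothing for the
git-add failure itself.
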
>
>
>                        Linus
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>