Re: "malloc failed"

on Tue Jan 27 2009, "Shawn O. Pearce" <spearce-AT-spearce.org> wrote:

> David Abrahams <dave@xxxxxxxxxxxx> wrote:
>> I've been abusing Git for a purpose it wasn't intended to serve:
>> archiving a large number of files with many duplicates and
>> near-duplicates.  Every once in a while, when trying to do something
>> really big, it tells me "malloc failed" and bails out (I think it's
>> during "git add", but because of the way I issued the commands I
>> can't tell: it could have been a commit or a gc).  This is on a
>> 64-bit Linux machine with 8G of RAM and plenty of swap space, so I'm
>> surprised.
>> 
>> Git is doing an amazing job at archiving and compressing all this stuff
>> I'm putting in it, but I have to do it a wee bit at a time or it craps
>> out.  Bug?
>
> No, not really.  Above you said you are "abusing git for a purpose
> it wasn't intended to serve"...

Absolutely; I want to be upfront about that :-)

> Git was never designed to handle many large binary blobs of data.

They're largely text blobs, although there's definitely a fair share
of binaries.

> It was mostly designed for source code, where the majority of the
> data stored in it is some form of text file written by a human.
>
> By their very nature these files need to be relatively short (e.g.
> under 1 MB each) as no human can sanely maintain a text file that
> large without breaking it apart into different smaller files (like
> the source code for an operating system kernel).
>
> As a result of this approach, the git code assumes it can malloc()
> at least two blocks large enough for each file: one for the fully
> decompressed content, and another for the fully compressed content.
> Try doing git add on a large file and it's very likely malloc
> will fail due to ulimit issues, or you just don't have enough
> memory/address space to go around.

Oh, so maybe I'm getting hit by ulimit; I didn't think of that.  I could
raise my ulimit to try to get around this.
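
To make the failure mode concrete, the pattern as I understand it is
roughly the following (a sketch with hypothetical names, using zlib
for concreteness; this is not git's actual code):

    /* Both the raw and the compressed buffer are live at once,
     * so a large file needs roughly 2x its size in heap. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <zlib.h>

    int compress_whole_file(const char *path)
    {
        FILE *f = fopen(path, "rb");
        if (!f)
            return -1;
        fseek(f, 0, SEEK_END);
        long size = ftell(f);
        rewind(f);

        unsigned char *raw = malloc(size);       /* block #1: raw content */
        uLongf zlen = compressBound(size);
        unsigned char *packed = malloc(zlen);    /* block #2: compressed */
        if (!raw || !packed) {
            fprintf(stderr, "malloc failed\n");  /* the error I'm seeing */
            free(raw); free(packed); fclose(f);
            return -1;
        }

        if (fread(raw, 1, size, f) == (size_t)size)
            compress(packed, &zlen, raw, size);
        /* ... write out `packed` ... */

        free(raw); free(packed); fclose(f);
        return 0;
    }

So a single 3G file would want over 6G of heap in one process, which a
conservative virtual-memory ulimit can refuse even on a machine with
8G of RAM.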

> git gc likewise needs a good chunk of memory, but it shouldn't
> usually report "malloc failed".  Usually in git gc if a malloc fails
> it prints a warning and degrades the quality of its data compression.
> But there are critical bookkeeping data structures where we must be
> able to malloc the memory, and if those fail because we've already
> exhausted the heap early on, then yea, it can fail too.
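
In code terms, I take that to mean something like this "degrade
instead of die" split (hypothetical names; not git's actual code):

    #include <stdio.h>
    #include <stdlib.h>

    /* Optional data (e.g. a bigger delta-search window): shrink the
     * request and keep going with worse compression. */
    static void *malloc_or_degrade(size_t *want)
    {
        while (*want) {
            void *p = malloc(*want);
            if (p)
                return p;
            fprintf(stderr, "warning: degrading, retrying with %zu bytes\n",
                    *want / 2);
            *want /= 2;
        }
        return NULL;    /* caller falls back to lower-quality deltas */
    }

    /* Critical bookkeeping: no fallback possible, so fail loudly. */
    static void *must_malloc(size_t n)
    {
        void *p = malloc(n);
        if (!p) {
            fprintf(stderr, "malloc failed\n");
            exit(1);
        }
        return p;
    }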

Thanks much for that, and for reminding me about ulimit.
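
For the record, here's how I'll check what my current limit actually
is before raising it -- a sketch using getrlimit()/setrlimit();
RLIMIT_AS is the limit that `ulimit -v` controls in the shell:

    #include <stdio.h>
    #include <sys/resource.h>

    int main(void)
    {
        struct rlimit rl;
        if (getrlimit(RLIMIT_AS, &rl) != 0) {
            perror("getrlimit");
            return 1;
        }
        if (rl.rlim_cur == RLIM_INFINITY)
            printf("address space: unlimited\n");
        else
            printf("address space soft limit: %llu bytes\n",
                   (unsigned long long)rl.rlim_cur);

        /* An unprivileged process may raise its soft limit up to
         * the hard limit. */
        rl.rlim_cur = rl.rlim_max;
        if (setrlimit(RLIMIT_AS, &rl) != 0)
            perror("setrlimit");
        return 0;
    }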

Cheers,

-- 
Dave Abrahams
BoostPro Computing
http://www.boostpro.com

