Re: [FYI] very large text files and their problems.

On Fri, Feb 24, 2012 at 06:14:46PM +0700, Nguyen Thai Ngoc Duy wrote:
> On Fri, Feb 24, 2012 at 5:11 PM, Ian Kumlien <pomac@xxxxxxxxx> wrote:
> > I'm not sure whether you got my reply, since I sent it out of band, so
> > I'll repeat myself. Sorry... =)
> 
> yes I received it, just too busy this week.

Ah good, you never know what anti-spam measures people apply these
days... =)

And I have the same problem, so I totally understand.

> >> > git needs at least as much free memory as the size of the largest
> >> > file... Couldn't this be worked around?
> >> >
> >> > On a (32-bit) machine with 4GB of memory, this results in:
> >> > fatal: Out of memory, malloc failed (tried to allocate 3310214313 bytes)
> >> >
> >> > (I can see how this could be a problem, but couldn't it be mitigated? Or
> >> > is it by design and intended behaviour?)
> >>
> >> I think that it's delta resolving that hogs all your memory. If your
> >> files are smaller than 512M, try lowering core.bigFileThreshold. The
> >> topic jc/split-blob, which stores a big file as several smaller
> >> pieces, might solve your problem. Unfortunately the topic is not
> >> complete yet.
> >
> > Well, in this case it's just stream-unpacking gzip data to disk. I
> > understand that delta could be a problem... But wouldn't delta be a
> > problem in the sense of <size_of_change>+<size_of_subdata>+<result>?
> >
> > I.e., if the file is mmapped, it shouldn't have to be allocated, right?
> 
> We should not delta large files. I was worried that the large file
> check could go wrong, but I guess your blob's not deltified in this
> case.

That would be correct.

> When you receive a pack during a clone, the pack is streamed to
> index-pack, not mmapped, and index-pack checks every object in there
> in uncompressed form. I think I have found a way to avoid allocating
> that much. It needs some more checking, then I'll send it out.

Ah! That explains a lot. Do you have a publicly available version I
could look at?
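
To illustrate the general idea of checking an object in uncompressed form
without allocating its full size, here is a minimal Python sketch. It is
not git's index-pack code and not Duy's change; it only shows one way to
inflate a zlib stream through a small fixed-size buffer and feed each piece
to SHA-1. The function name blob_id_streaming and the 64 KiB buffer size
are assumptions made for this example.

```python
import hashlib
import zlib

CHUNK = 64 * 1024  # working buffer size; an arbitrary choice for this sketch


def blob_id_streaming(compressed_chunks, size):
    """Return the hex object id of a zlib-compressed blob.

    compressed_chunks: iterable of bytes (e.g. successive reads from a pipe)
    size: inflated size in bytes, assumed to be known before the data arrives
    """
    inflater = zlib.decompressobj()
    sha = hashlib.sha1()
    # git hashes "<type> <size>\0" followed by the object content
    sha.update(b"blob %d\0" % size)
    seen = 0
    for chunk in compressed_chunks:
        data = chunk
        while data:
            # Ask for at most CHUNK bytes of output per call, so memory
            # use stays bounded no matter how large the blob is.
            out = inflater.decompress(data, CHUNK)
            sha.update(out)
            seen += len(out)
            data = inflater.unconsumed_tail
    # Collect whatever small amount of output is still pending.
    out = inflater.flush()
    sha.update(out)
    seen += len(out)
    if seen != size:
        raise ValueError("inflated size %d does not match header %d" % (seen, size))
    return sha.hexdigest()
```

Peak memory here is roughly one input chunk plus one CHUNK-sized output
buffer, instead of the full inflated size that the malloc failure above
was asking for.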

> -- 
> Duy