On Sat, 10 Nov 2007, bob wrote:
>
> The reason that I ask is that I have been playing different
> scenarios using git 1.5.3.5 under MacOSX 10.4.10 mostly
> all day and every time that
>
> A) a file approaches or exceeds 2gig on an 'add', it
> results in:
>
>   fatal: Out of memory? mmap failed: Cannot allocate memory

Git wants to handle single files as one single entity, so single big
files really do end up being very painful. The costs of compressing them
and generating deltas would probably get prohibitively high *anyway*, but
it does mean that if you have gigabyte files, you really want a 64-bit VM.

I thought OS X could do 64 bits these days. Maybe not.

Anyway, that explains the "cannot allocate memory": git simply wants to
mmap the whole file, and you don't have enough VM address space for it.

(And if you seriously want to work with multi-gigabyte files, git
probably isn't going to perform wonderfully well, even though it *should*
work fine if you just have a full 64-bit environment that allows the
mmap.)

> B) the repository size less the .git subdirectory approaches
> 4gig on a 'fetch' it results in:
>
>   Resolving 3356 deltas...
>   fatal: serious inflate inconsistency: -3 (unknown compression method)

That sounds really broken. I'm not seeing what would cause that, apart
from some really bad data corruption and/or a broken zlib implementation.

But if the pack-file really is 2GB+ in size, I could imagine some sign
issues cropping up. Git will generally use "unsigned long" (which is
probably just 32-bit on your setup), but since git in those circumstances
would be limited by the size of the VM _anyway_, that's not really much
of a limitation (although it's probably broken on the crazy Windows
"LLP64" model). But maybe we have some place where we use a signed type,
or zlib does, and I could see that causing breakage.

But that code-sequence really should never even come *close* to the
31-bit limit, as long as the individual objects themselves aren't bigger
than the available VM space (and git currently assumes "unsigned long" is
sufficiently big to cover the VM space, which is not technically correct,
but should be fine on OS X too).

That said, we should use "off_t" in that function. I suspect we have a
number of people (read: me) who have grown too used to living in a 64-bit
world..
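Just to illustrate the kind of silent truncation I'm worried about, here
is a stand-alone toy program (not git source, purely an illustration): on
an ILP32 target, squeezing a 64-bit off_t into an "unsigned long" simply
drops the high bits, so a 5GB size quietly turns into 1GB:

    /* Toy example: off_t vs "unsigned long" on a 32-bit build. */
    #define _FILE_OFFSET_BITS 64  /* ask for a 64-bit off_t on 32-bit targets */
    #include <stdio.h>
    #include <sys/types.h>

    int main(void)
    {
        off_t size = (off_t)5 << 30;            /* pretend: a 5GB file */
        unsigned long trunc = (unsigned long)size;

        printf("sizeof(unsigned long)=%zu sizeof(off_t)=%zu\n",
               sizeof(unsigned long), sizeof(off_t));
        printf("off_t says %lld, unsigned long says %lu\n",
               (long long)size, trunc);
        return 0;
    }

On a 64-bit box both numbers come out the same; on a 32-bit build the
unsigned long cheerfully reports 1073741824. Which is exactly why that
function should be using off_t.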
> I have been testing on my laptop which has a 32-bit Intel Core Duo.

Ok, so you're 32-bit limited even if there were some 64-bit support in
OS X.

> Also, I have run the same tests on a dual quad-core Intel processor
> which is 64 bit, (but not sure that Apple uses the 64 bits in 10.4.10). I
> get the same results as above.

I'm pretty sure OS X defaults to a 32-bit environment, but has at least
*some* 64-bit support. It would definitely need to be enabled explicitly
(since they made the *insane* decision to move over to Intel laptop chips
six months before they got 64-bit support! Somebody at Apple is a total
idiot, and should get fired).

So it would be interesting to hear if a 64-bit build makes a difference.

> The zlib is at the latest revision of 1.2.3 and gcc is at 4.0.1
> which from what I can tell supports large files, because 'off_t' is 8 bytes
> which is the size used for a 'stat' file size.

See above: single files are size-limited, but with a large off_t like
yours, you should be fine. Except we may have screwed up somewhere.

> I am just wondering if these size limitations exist for MacOSX
> or maybe I am doing something wrong (which is probably
> the case).

We *have* had issues with broken implementations of "pread()" on some
systems. You could try setting NO_PREAD in the Makefile and compiling
with the compatibility function (basically just an lseek+read pair - see
the rough sketch in the PS below). That's the only thing that comes to
mind as being worth trying in that area.

And if you have some script to generate the repository (i.e. you aren't
using "live data", but are testing the limits of the system), it would be
interesting if you could make it available, so that people with non-OSX
environments can test too. I certainly have some 32-bit environments
myself (old Linux boxes), but I'm too lazy to write a test-case, so I was
hoping you had some simple scripts that I could just run to see whether I
can reproduce the behaviour you describe.

That said, I have worked with a 3GB pack-file (one of the KDE trial
repos), and that worked fine. But git does tend to want a *lot* of memory
for really big repositories, so I suspect that if you actually work with
2GB+ pack-files, you'll want a 64-bit environment just because you'll
want more than 2GB of physical RAM in order to be able to access it all
efficiently.

		Linus
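PS. The NO_PREAD compatibility function is nothing magical - it's
basically just a seek+read+seek-back sequence. A rough sketch of that
approach (the general idea, not the actual compat code - and note that
unlike a real pread() it moves the file position around, so it is not
safe if anything else reads the same fd concurrently):

    #include <unistd.h>
    #include <sys/types.h>

    /* Emulate pread(): remember the file position, seek, read, restore. */
    static ssize_t compat_pread(int fd, void *buf, size_t count, off_t offset)
    {
        off_t saved = lseek(fd, 0, SEEK_CUR);
        ssize_t ret;

        if (saved < 0 || lseek(fd, offset, SEEK_SET) < 0)
            return -1;
        ret = read(fd, buf, count);
        if (lseek(fd, saved, SEEK_SET) < 0)
            return -1;      /* position lost - report failure */
        return ret;
    }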