On Tue, Jan 27, 2009 at 10:04:42AM -0500, David Abrahams wrote:

> I've been abusing Git for a purpose it wasn't intended to serve:
> archiving a large number of files with many duplicates and
> near-duplicates. Every once in a while, when trying to do something
> really big, it tells me "malloc failed" and bails out (I think it's
> during "git add" but because of the way I issued the commands I can't
> tell: it could have been a commit or a gc). This is on a 64-bit linux
> machine with 8G of ram and plenty of swap space, so I'm surprised.
>
> Git is doing an amazing job at archiving and compressing all this stuff
> I'm putting in it, but I have to do it a wee bit at a time or it craps
> out. Bug?

How big is the repository? How big are the biggest files? I have a 3.5G
repo with files ranging from a few bytes to about 180M. I've never run
into malloc problems or gone into swap on my measly 1G box. How does
your dataset compare? (A couple of commands for sizing things up are
sketched below my sig.)

As others have mentioned, git wasn't really designed specifically for
those sorts of numbers, but in the interests of performance, I find git
is usually pretty careful about not keeping too much useless stuff in
memory at one time. And the fact that you can perform the same operation
a little bit at a time and achieve success implies to me there might be
a leak or some silly behavior that can be fixed.

It would help a lot if we knew the operation that was causing the
problem. Can you try to isolate the failed command next time it happens?
(There's also a sketch below of one way to split the steps apart.)

-Peff
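
A rough, untested sketch for sizing things up; the 100M cutoff on the
last line is just an arbitrary threshold for illustration (it lists
working-tree files bigger than that), not anything git cares about:

  du -sh .git                  # total size of git's object database
  git count-objects -v         # loose vs. packed object counts and sizes
  find . -path ./.git -prune -o -type f -size +100M -exec ls -lh {} \;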
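
And to isolate which step dies, the simplest thing is probably to run
the steps one at a time and echo each exit code, something like this
(the commit message is just a placeholder):

  git add .                   ; echo "add exited with $?"
  git commit -m "next batch"  ; echo "commit exited with $?"
  git gc                      ; echo "gc exited with $?"

Whichever step reports a non-zero exit right after the "malloc failed"
message is the culprit.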