On Sun, Aug 25, 2013 at 11:50:01AM -0400, Pete Wyckoff wrote:
> Modern git, including your version, do "streaming" reads from p4,
> so the git-p4 python process never even holds a whole file's
> worth of data. You're seeing git-fast-import die, it seems. It
> will hold onto the entire file contents. But just one, not the
> entire repo. How big is the single largest file?
>
> You can import in pieces. See the change numbers like this:
>
>     p4 changes -m 1000 //depot/big/...
>     p4 changes -m 1000 //depot/big/...@<some-old-change>
>
> Import something far enough back in history so that it seems
> to work:
>
>     git p4 clone --destination=big //depot/big@60602
>     cd big
>
> Sync up a bit at a time:
>
>     git p4 sync @60700
>     git p4 sync @60800
>     ...
>
> I don't expect this to get around the problem you describe,
> however. Sounds like there is one gigantic file that is causing
> git-fast-import to fill all of memory. You will at least isolate
> the change.
>
> There are options to git-fast-import to limit max pack size
> and to cause it to skip importing files that are too big, if
> that would help.
>
> You can also use a client spec to hide the offending files
> from git.
>
> Can you watch with "top"? Hit "M" to sort by memory usage, and
> see how big the processes get before falling over.
>
> 		-- Pete

You are correct that git-fast-import is killed by the OOM killer, but I
was unclear about which process was malloc()ing so much memory that the
OOM killer got invoked (as other completely unrelated processes usually
also get killed when this happens).

Unless there's one gigantic file in one change that gets removed by
another change, I don't think that's the problem; as I mentioned in
another email, the machine has 32GB physical memory and the largest
single file in the current head is only 118MB. Even if there is a very
large transient file somewhere in the history, I seriously doubt it's
tens of gigabytes in size.

I have tried watching it with top before, but it takes several hours
before it dies. I haven't been able to see any explosion of memory
usage, even within the final hour, but I've never caught it just before
it dies, either. I suspect that whatever the issue is here, it happens
very quickly.

If I'm unable to get through this today using the incremental p4 sync
method you described, I'll try running a full-blown clone overnight
with top in batch mode writing to a log file to see whether it catches
anything.

Thanks again,

Corey
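
P.S. In case it's useful, here is roughly what I mean by logging top in
batch mode overnight. The sampling interval, iteration count, log path,
and the process names I grep for afterwards are only guesses at this
point:

    # Sample once a minute for about 12 hours, appending to a log.
    top -b -d 60 -n 720 > /tmp/git-p4-top.log 2>&1 &

    # Kick off the full clone as before.
    git p4 clone --destination=big //depot/big@all

    # Afterwards, keep each frame's timestamp line ("top - HH:MM:SS ...")
    # plus the interesting processes, and watch how the RES column grows.
    grep -E '^top -|git-p4|git-fast-import|python' /tmp/git-p4-top.log | less

If the top on that machine supports it, adding -o %MEM should sort each
frame by memory usage, the same as hitting "M" interactively.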