Junio, Nico,
 I think we need to do something about it.

CLee was complaining about git-index-pack on #irc with the partial KDE
repo, and while I don't have the KDE repo, I decided to investigate a bit.

Even with just the kernel repo (with a single 170MB pack-file), I can do

	git index-pack --stdin --fix-thin new.pack < .git/objects/pack/pack-*.pack

and it uses 52s of CPU-time, and on my 4GB machine it actually started
doing IO and swapping, because git-index-pack grew to 4.8GB in size. So
while I initially thought I'd want a bigger test-case to see the problem,
I sure as heck don't.

The 52s of CPU time exploded into almost three minutes of actual
real-time:

	47.33user 5.79system 2:41.65elapsed 32%CPU 2117major+1245763minor

And that's on a good system with a powerful CPU, "enough memory" for any
reasonable development, and good disks! Very much ungood-plus-plus.

I haven't looked into exactly why yet, but I bet it's just that we keep
every single object expanded in memory. We do need to keep the objects
around, so that we can resolve deltas, but we can certainly do it other
ways.

Two suggestions for other ways:

 - simple one: don't keep expanded objects around, just keep the deltas,
   and spend tons of CPU-time just re-expanding them if required.

   We *should* be able to do it with just keeping the original 170MB
   pack-file in memory, not expanding it to 3.8GB!

   Still, even this will be painful once you have a big pack-file, and
   the CPU waste is nasty (although a delta-base cache like we do in
   sha1_file.c would probably fix it 99% - at that point it's getting
   less simple, and the "best" solution below looks more palatable)

 - best one: when writing out the pack-file, we incrementally keep a
   "struct packed_git" around, and update the index for it dynamically,
   and totally get rid of all objects that we've written out, because we
   can re-create them.

   This means that we should have _zero_ memory footprint except for the
   one object that we're working on right then and there, and any
   unresolved deltas where we've not seen the base at all (and the latter
   generally shouldn't happen any more with most pack-files).

The "best one" wouldn't seem to be *that* painful, but as mentioned, I
haven't even started looking at the code yet, I thought I'd try to rope
Nico into looking at this first ;)

		Linus
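
For illustration only, the "simple one" with a small delta-base cache
could be sketched roughly as below. This is a toy model in Python, not
git's code: the ToyPack class, its method names, and the copy/insert
delta tuples are all made up for the example, and the real pack delta
encoding is a binary format that this does not attempt to reproduce.
The point is just that only the compact entries stay resident, and
expanded objects live in a bounded cache and get rebuilt when needed.

```python
import zlib
from collections import OrderedDict


def apply_delta(base, ops):
    """Rebuild an object from its base and a list of toy delta ops.

    Each op is ("copy", offset, length) or ("insert", data). This is a
    simplified stand-in for a real delta format (assumption).
    """
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


class ToyPack:
    """Hypothetical pack: keeps compact entries, expands on demand."""

    def __init__(self, cache_size=2):
        self.entries = {}            # name -> (base name or None, payload)
        self.cache = OrderedDict()   # bounded LRU of expanded objects
        self.cache_size = cache_size
        self.expansions = 0          # counts how often we had to re-expand

    def add_base(self, name, data):
        # Full objects are kept compressed, never expanded up front.
        self.entries[name] = (None, zlib.compress(data))

    def add_delta(self, name, base_name, ops):
        self.entries[name] = (base_name, ops)

    def get(self, name):
        if name in self.cache:
            self.cache.move_to_end(name)   # mark as most recently used
            return self.cache[name]
        base_name, payload = self.entries[name]
        if base_name is None:
            data = zlib.decompress(payload)
        else:
            # Recursion depth equals the delta chain length; fine for a
            # sketch, a real implementation would want to iterate.
            data = apply_delta(self.get(base_name), payload)
        self.expansions += 1
        self.cache[name] = data
        while len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict least recently used
        return data


pack = ToyPack(cache_size=2)
pack.add_base("v1", b"hello world\n")
pack.add_delta("v2", "v1", [("copy", 0, 6), ("insert", b"git\n")])
pack.add_delta("v3", "v2", [("copy", 0, 6), ("insert", b"pack\n")])
print(pack.get("v3"))
```

The cache size is the knob that trades memory for CPU: with a cache
that holds recent bases, resolving a run of deltas against the same
chain avoids most of the re-expansion work, while the resident set
stays bounded no matter how many objects the pack contains.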