On 12/11/07, Nicolas Pitre <nico@xxxxxxx> wrote:
> On Tue, 11 Dec 2007, Nicolas Pitre wrote:
>
> > And yet, this is still missing the actual issue.  The issue being that
> > the 2.1GB pack as a _source_ doesn't cause as much memory to be
> > allocated even if the _result_ pack ends up being the same.
> >
> > I was able to repack the 2.1GB pack on my machine which has 1GB of RAM.
> > Now that it has been repacked, I can't repack it anymore, even when
> > single threaded, as it starts crawling into swap fairly quickly.  It is
> > really non-intuitive and actually senseless that Git would require twice
> > as much RAM to deal with a pack that is 7 times smaller.
>
> OK, here's something else for you to try:
>
>         core.deltabasecachelimit=0
>         pack.threads=2
>         pack.deltacachesize=1
>
> With that I'm able to repack the small gcc pack on my machine with 1GB
> of RAM using:
>
>         git repack -a -f -d --window=250 --depth=250
>
> and top reports a ~700m virt and ~500m res without hitting swap at all.
> It is only at 25% so far, but I was unable to get that far before.
>
> Would be curious to know what you get with 4 threads on your machine.

Changing those parameters really slowed down counting the objects. I
used to be able to count in 45 seconds; now it took 130 seconds. I
still have the Google allocator linked in.

4 threads, cumulative clock time:
  25%  200 seconds,  820/627M
  55%  510 seconds, 1240/1000M - little late recording
  75%  15 minutes,  1658/1500M
  90%  22 minutes,  1974/1800M
It's still running but there is no significant change.

Are two types of allocations being mixed?
1) long-term, global objects kept until the end of everything
2) volatile, private objects allocated only while the object is being
   compressed and then freed

Separating these would make a big difference to the fragmentation
problem. Single threading probably wouldn't see a fragmentation
problem from mixing the allocation types.

When a thread is created it could allocate a private 20MB (or
whatever) pool. The volatile, private objects would come from that
pool. Long-term objects would stay in the global pool. Since they are
long term they will just get laid down sequentially in memory.
Separating these allocation types makes things way easier for malloc.
(A rough sketch of what I mean is appended after my sig.)

CPU time would be helped by removing some of the locking if possible.

--
Jon Smirl
jonsmirl@xxxxxxxxx
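
P.S. Here is roughly what I have in mind for the per-thread pool. None
of these functions or names exist in git; it's only a sketch of the
two-pool idea, assuming the per-object scratch allocations can be
bump-allocated and then thrown away in one go when the thread finishes
with an object:

	/*
	 * Hypothetical per-thread arena: each delta-search thread grabs a
	 * private 20MB block up front and carves its short-lived scratch
	 * allocations out of it, so they never interleave with the
	 * long-lived object entries sitting in the global heap.
	 */
	#include <stdlib.h>

	#define ARENA_SIZE (20 * 1024 * 1024)	/* 20MB per thread */

	struct arena {
		char *base;
		size_t used;
		size_t size;
	};

	static struct arena *arena_create(size_t size)
	{
		struct arena *a = malloc(sizeof(*a));
		if (!a)
			return NULL;
		a->base = malloc(size);
		if (!a->base) {
			free(a);
			return NULL;
		}
		a->used = 0;
		a->size = size;
		return a;
	}

	/* volatile, private allocations: bump a pointer, no locking needed */
	static void *arena_alloc(struct arena *a, size_t n)
	{
		void *p;
		n = (n + 15) & ~(size_t)15;	/* keep 16-byte alignment */
		if (a->used + n > a->size)
			return NULL;		/* caller falls back to malloc() */
		p = a->base + a->used;
		a->used += n;
		return p;
	}

	/* drop everything once the thread is done with the current object */
	static void arena_reset(struct arena *a)
	{
		a->used = 0;
	}

	static void arena_destroy(struct arena *a)
	{
		free(a->base);
		free(a);
	}

Long-term objects would keep going through plain malloc() in the global
heap; only the scratch memory used during delta compression would come
out of the arena and get reset between objects.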