On Fri, 2007-12-07 at 20:46 -0500, Nicolas Pitre wrote:
> On Fri, 7 Dec 2007, Jon Smirl wrote:
> > And the 330MB gcc pack for input
> > git repack -a -d -f --depth=250 --window=250
> >
> > complete  seconds  RAM
> >   10%        47    1GB
> >   20%        29    1GB
> >   30%        24    1GB
> >   40%        18    1GB
> >   50%       110    1.2GB
> >   60%        85    1.4GB
> >   70%       195    1.5GB
> >   80%       186    2.5GB
> >   90%       489    3.8GB
> >   95%       800    4.8GB
> > I killed it because it started swapping
> >
> > The mmaps are only about 400MB in this case.
> > At the end the git process had 4.4GB of physical RAM allocated.
> > Starting with a 2GB pack of the same data my process size only grew to
> > 3GB with 2GB of mmaps.
>
> Which is quite reasonable, even if the same issue might still be there.
>
> So the problem seems to be related to the pack access code and not the
> repack code. And it must have something to do with the number of deltas
> being replayed. And because the repack is attempting delta compression
> roughly from newest to oldest, and because old objects are typically in
> a deeper delta chain, then this might explain the logarithmic slowdown.
>
> So something must be wrong with the delta cache in sha1_file.c somehow.

All I have is a qualitative observation, but while creating the pack
there was a _huge_ slowdown between 10% and 15%: the rate dropped from
hundreds or dozens of objects per second to a single object per second,
with a corresponding increase in process size. I didn't keep any numbers
at the time, but it was noticeable.

I wonder if there are a bunch of huge objects somewhere in gcc's
history?

Harvey