On 12/6/07, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
>
> On Thu, 6 Dec 2007, Jeff King wrote:
> >
> > What is really disappointing is that we saved only about 20% of the
> > time. I didn't sit around watching the stages, but my guess is that we
> > spent a long time in the single threaded "writing objects" stage with a
> > thrashing delta cache.
>
> I don't think you spent all that much time writing the objects. That part
> isn't very intensive, it's mostly about the IO.
>
> I suspect you may simply be dominated by memory-throughput issues. The
> delta matching doesn't cache all that well, and using two or more cores
> isn't going to help all that much if they are largely waiting for memory
> (and quite possibly also perhaps fighting each other for a shared cache?
> Is this a Core 2 with the shared L2?)

When I last looked at the code, the problem was in evenly dividing the
work. I was using a four core machine, and most of the time one core
would end up with 3-5x the work of the most lightly loaded core. Setting
pack.threads up to 20 fixed the problem. With a high number of threads
I was able to get a 4hr pack to finish in something like 1:15.

A scheme where each core could work for about a minute without
communicating with the other cores would be best. It would also be more
efficient if the cores could avoid having sync points between them.

--
Jon Smirl
jonsmirl@xxxxxxxxx
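To illustrate the load-balancing point, here is a toy model of why raising pack.threads well above the core count can help. It is only a sketch under stated assumptions, not git's pack-objects code: the object costs are made up, and `split_contiguous` and `makespan` are hypothetical helpers. It assumes the object list is cut into equal-sized contiguous chunks, one per thread, and that a free core simply picks up the next unstarted chunk, which roughly approximates an OS scheduler running many threads on few cores.

```python
import heapq


def split_contiguous(costs, parts):
    """Split `costs` into `parts` contiguous chunks of near-equal item count."""
    n = len(costs)
    bounds = [n * i // parts for i in range(parts + 1)]
    return [costs[bounds[i]:bounds[i + 1]] for i in range(parts)]


def makespan(costs, cores, chunks):
    """Wall-clock time when `chunks` contiguous chunks run on `cores` cores.

    Each chunk goes to whichever core becomes free first. No sync points
    are modelled: a core never talks to the others while inside a chunk.
    """
    finish = [0.0] * cores  # min-heap of per-core finish times
    heapq.heapify(finish)
    for chunk in split_contiguous(costs, chunks):
        earliest = heapq.heappop(finish)
        heapq.heappush(finish, earliest + sum(chunk))
    return max(finish)


# Hypothetical workload: 1000 objects, the first quarter 5x as expensive
# to delta-compress as the rest (made-up numbers for illustration).
costs = [5.0] * 250 + [1.0] * 750

ideal = sum(costs) / 4                         # perfect split over 4 cores
four = makespan(costs, cores=4, chunks=4)      # one chunk per core
twenty = makespan(costs, cores=4, chunks=20)   # like pack.threads = 20

print("ideal:", ideal)
print("4 chunks:", four)
print("20 chunks:", twenty)
```

In this made-up workload the four-chunk split leaves one core with five times the work of each of the others, so the whole run waits on that core. With twenty chunks the expensive region is spread over several cores and the total time comes out at the ideal. Real pack data will not divide this cleanly, so the numbers only show the direction of the effect.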