On Fri, Jul 20, 2018 at 7:28 AM Jeff King <peff@xxxxxxxx> wrote:
>
> On Thu, Jul 19, 2018 at 04:11:01PM -0700, Elijah Newren wrote:
>
> > Looking at the output from Peff's instrumentation elsewhere in this
> > thread, I see a lot of lines like
> >    mismatched get: 32889efd307c7be376da9e3d45a78305f14ba73a = (, 28)
> > Does that mean it was reading the array when it wasn't ready?
>
> Yes, it looks like we saw a "get" without a "set". Though this could
> also be due to threading. The tracing isn't atomic with respect to the
> actual get/set operation, so it's possible that the ordering of the
> trace output does not match the ordering of the actual operations.
>
> > However, it's interesting to also look at the effect on packing
> > linux.git (on the same beefy hardware):
> >
> > Version  Pack (MB)  MaxRSS(kB)  Time (s)
> > -------  ---------  ----------  --------
> >  2.17.0     1279     11382932     632.24
> >  2.18.0     1279     10817568     621.97
> >  fix-v4     1279     11484168    1193.67
> >
> > While the pack size is nice and small, the original memory savings
> > added in 2.18.0 are gone and the performance is much worse. :-(
>
> Interesting. I can't reproduce here. The fix-v4 case is only slightly
> slower than 2.18.0. Can you double check that your compiler flags, etc,
> were the same? Many times I've accidentally compared -O0 to -O0. :)

He ran 40 threads though. That number of threads can make lock
contention very expensive. Yeah, my money is also on lock contention.

> You might also try the patch below (on top of fix-v4), which moves the

Another thing Elijah could try is to watch CPU utilization. If this is
lock contention, I think core utilization should be much lower because
we spend more time waiting than actually doing things.

> locking to its own dedicated mutex. That should reduce lock contention,

I think we could use cache_lock(), which is for non-odb shared data
(and delta_size[] fits this category).

> and it fixes the remaining realloc where I think we're still racy. On my

Yeah, it's not truly racy, as you also noted in another mail. I'll make
a note about this in the commit message.

> repack of linux.git, it dropped the runtime from 6m3s to 5m41s, almost
> entirely in system CPU.
--
Duy
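
For anyone following the dedicated-mutex idea, here is a rough C sketch of
the general pattern: a lazily grown side array guarded by its own lock, so
that both the realloc and the reads happen under that one mutex. All names
here (struct size_table, size_table_set(), size_table_get()) are
hypothetical and made up for illustration. This is not the patch from this
thread and not git's actual pack-objects code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a side array such as delta_size[]. */
struct size_table {
	pthread_mutex_t lock;  /* dedicated mutex: guards only the fields below */
	unsigned long *sizes;  /* grown on demand */
	size_t nr_alloc;       /* number of allocated slots */
};

void size_table_init(struct size_table *t)
{
	assert(t != NULL);
	pthread_mutex_init(&t->lock, NULL);
	t->sizes = NULL;
	t->nr_alloc = 0;
}

/*
 * Store a size for slot idx. The array is grown while holding the lock,
 * so no other thread can read through a pointer that realloc() has
 * already invalidated. Returns 0 on success, -1 if allocation fails.
 */
int size_table_set(struct size_table *t, size_t idx, unsigned long size)
{
	int ret = 0;

	pthread_mutex_lock(&t->lock);
	if (idx >= t->nr_alloc) {
		size_t new_alloc = (idx + 1) * 2;
		unsigned long *p = realloc(t->sizes, new_alloc * sizeof(*p));

		if (!p) {
			ret = -1;
		} else {
			/* Zero the new slots so an unset entry reads as 0. */
			for (size_t i = t->nr_alloc; i < new_alloc; i++)
				p[i] = 0;
			t->sizes = p;
			t->nr_alloc = new_alloc;
		}
	}
	if (!ret)
		t->sizes[idx] = size;
	pthread_mutex_unlock(&t->lock);
	return ret;
}

/* Read the size for slot idx; slots never set (or out of range) read as 0. */
unsigned long size_table_get(struct size_table *t, size_t idx)
{
	unsigned long size = 0;

	pthread_mutex_lock(&t->lock);
	if (idx < t->nr_alloc)
		size = t->sizes[idx];
	pthread_mutex_unlock(&t->lock);
	return size;
}
```

Because this lock covers nothing but the side array, threads touching other
shared state do not queue behind it. Whether that helps in practice depends
on how often the array is hit, which is what the timings and the CPU
utilization check would show.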