On Sun, Jul 22, 2018 at 8:22 AM Elijah Newren <newren@xxxxxxxxx> wrote:
>
> On Fri, Jul 20, 2018 at 9:47 PM, Duy Nguyen <pclouds@xxxxxxxxx> wrote:
> > On Fri, Jul 20, 2018 at 10:43:25AM -0700, Elijah Newren wrote:
> >> Out of curiosity, would it be possible to use the delta_size_ field
> >> for deltas that are small enough, and only use an external data
> >> structure (perhaps a hash rather than an array) for the few deltas
> >> that are large?
> >
> > We could. And because repack time is still a bit higher in your
> > linux.git case. Let's try this. No locking in common case and very
> > small locked region when we hit large deltas
>
> This one looks like a winner. Labelling this as fix-v7, this rounds
> out the table to:
>
> Version  Time (s)
> -------  --------
>  2.17.0   621.36
>  2.18.0   621.80
>  fix-v5   836.29
>  fix-v6   831.73
>  fix-v2   619.96
>  fix-v7   622.88
>
> So the runtime is basically within the noise of different runs of the
> timing for 2.17.0 or 2.18.0 or -v2, and is much faster than -v5 or
> -v6.

Thanks. I'm looking forward to asking you to test lock-related changes
on this 40-core monster in the future :D

Unrelated point of improvement for the future. I notice that, at least
on my machine, I have 100% CPU on one core during the writing phase,
likely because deltas are being recomputed to be written down and we
don't produce deltas fast enough. We should be able to take advantage
of multiple cores to recompute deltas in advance at this stage and
shorten pack-objects time some more.
-- 
Duy
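
For readers following the thread, here is a minimal sketch of the scheme discussed above: small delta sizes live in a bitfield inside the entry and need no locking, and only the rare large delta goes to an external array under a mutex. This is an illustration, not the actual patch. All names (struct entry, struct pack, set_delta_size, get_delta_size, the 20-bit width) are assumptions made for the example, and it uses a plain array rather than the hash suggested in the quoted mail. It also assumes each entry is written by only one thread at a time.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical width; sizes below the limit fit inline. */
#define DELTA_SIZE_BITS 20
#define DELTA_SIZE_LIMIT (1UL << DELTA_SIZE_BITS)

struct entry {
	unsigned delta_size_ : DELTA_SIZE_BITS; /* meaningful only if valid */
	unsigned delta_size_valid : 1;          /* 1: size is stored inline */
};

struct pack {
	struct entry *entries;
	size_t nr;
	unsigned long *big_delta_size; /* allocated on first large delta */
	pthread_mutex_t lock;          /* guards big_delta_size only */
};

static int pack_init(struct pack *p, size_t nr)
{
	p->entries = calloc(nr, sizeof(*p->entries));
	p->nr = nr;
	p->big_delta_size = NULL;
	if (!p->entries)
		return -1;
	return pthread_mutex_init(&p->lock, NULL);
}

static int set_delta_size(struct pack *p, size_t i, unsigned long size)
{
	struct entry *e;

	assert(i < p->nr);
	e = &p->entries[i];

	if (size < DELTA_SIZE_LIMIT) {
		/* Common case: no lock taken. */
		e->delta_size_ = size;
		e->delta_size_valid = 1;
		return 0;
	}

	/* Rare case: small locked region around the shared array. */
	pthread_mutex_lock(&p->lock);
	if (!p->big_delta_size)
		p->big_delta_size = calloc(p->nr, sizeof(*p->big_delta_size));
	if (!p->big_delta_size) {
		pthread_mutex_unlock(&p->lock);
		return -1;
	}
	p->big_delta_size[i] = size;
	pthread_mutex_unlock(&p->lock);

	e->delta_size_valid = 0;
	return 0;
}

static unsigned long get_delta_size(struct pack *p, size_t i)
{
	unsigned long size;

	assert(i < p->nr);
	if (p->entries[i].delta_size_valid)
		return p->entries[i].delta_size_;

	/* Not inline: look in the external array (0 if never set). */
	pthread_mutex_lock(&p->lock);
	size = p->big_delta_size ? p->big_delta_size[i] : 0;
	pthread_mutex_unlock(&p->lock);
	return size;
}
```

The external array is allocated lazily, so a repack that never meets a large delta takes no lock and pays no extra memory. That matches the "no locking in common case" goal stated in the quoted mail.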