Duy, Patrick, On Tue, Jan 22, 2019 at 9:52 AM Elijah Newren <newren@xxxxxxxxx> wrote: > > On Mon, Jan 21, 2019 at 2:02 AM Duy Nguyen <pclouds@xxxxxxxxx> wrote: > > > > On Sat, Jan 19, 2019 at 10:45 PM Patrick Hogg <phogg@xxxxxxxxxxxx> wrote: > > > > > > ac77d0c37 ("pack-objects: shrink size field in struct object_entry", > > > 2018-04-14) added an extra usage of read_lock/read_unlock in the newly > > > introduced oe_get_size_slow for thread safety in parallel calls to > > > try_delta(). Unfortunately oe_get_size_slow is also used in serial > > > code, some of which is called before the first invocation of > > > ll_find_deltas. As such the read mutex is not guaranteed to be > > > initialized. > > > > > > Resolve this by using the existing lock in packing_data which is > > > initialized early in cmd_pack_objects instead of read_mutex. > > > Additionally, upgrade the packing_data lock to a recursive mutex to > > > make it a suitable replacement for read_mutex. > > > > > > Signed-off-by: Patrick Hogg <phogg@xxxxxxxxxxxx> > > > --- > > > > > > As I mentioned in the prior thread I think that it will be simpler > > > to simply use the existing lock in packing_data instead of moving > > > read_mutex. I can go back to simply moving read_mutex to the > > > packing_data struct if that that is preferable, though. > > > > In early iterations of these changes, I think we hit high contention > > when sharing the mutex [1]. I don't know if we will hit the same > > performance problem again with this patch. It would be great if Elijah > > with his zillion core machine could test this out. Otherwise it may be > > just safer to keep the two mutexes separate. > Before this patch, repacking an old clone of linux.git on a 40-core box (an m4.10xlarge) using the command /usr/bin/time -f 'MaxRSS:%M Time:%e' git gc --aggressive based on commit 16a465bc01 ("Third batch after 2.20", 2019-01-18), resulted in: MaxRSS:10959072 Time:650.63 Since that was a cold cache, might have had loose objects and what not, I threw that one out and ran three more times: MaxRSS: 9886284 Time:571.40 MaxRSS:10441716 Time:566.47 MaxRSS:10157536 Time:569.79 Then I applied this patch and ran three more times: MaxRSS:10611444 Time:574.60 MaxRSS:10429732 Time:571.48 MaxRSS:10657160 Time:575.58 (Yeah, we talked about MaxRSS not being the most useful last time, but I'm just copying the command I used last time for consistency; I'm only really paying attention to Time.) So, it's within 1% of the timing of the current version. Seems fine to me. Hope that helps, Elijah