On Fri, Jan 18, 2019 at 8:04 PM Patrick Hogg <phogg@xxxxxxxxxxxx> wrote:
>
> On Fri, Jan 18, 2019 at 4:21 AM Duy Nguyen <pclouds@xxxxxxxxx> wrote:
>>
>> On Fri, Jan 18, 2019 at 9:28 AM Patrick Hogg <phogg@xxxxxxxxxxxx> wrote:
>> >
>> > ac77d0c37 ("pack-objects: shrink size field in struct object_entry",
>> > 2018-04-14) added an extra usage of read_lock/read_unlock in the newly
>> > introduced oe_get_size_slow for thread safety in parallel calls to
>> > try_delta(). Unfortunately oe_get_size_slow is also used in serial
>> > code, some of which is called before the first invocation of
>> > ll_find_deltas. As such the read mutex is not guaranteed to be
>> > initialized.
>>
>> This must be the SIZE() macros in type_size_sort(), isn't it? I think
>> we hit the same problem (use of uninitialized mutex) in this same code
>> not long ago. I wonder if there's any way we can reliably test and
>> catch this.
>
>
> It was actually the SET_SIZE macro in check_object, at least for the
> repo at my company that hits this issue. I took a look at the call
> tree for oe_get_size_slow and found that it's used in many places
> outside of ll_find_deltas, so there are many potential call sites
> where this could crop up:
>
> [snip]
>

Ah, yes. I think the only problematic place is from prepare_pack(). The
single-threaded access after ll_find_deltas() is fine because we never
destroy mutexes.

> (Sorry if this is redundant for those who know the code better)

Actually, I'm the one who should say sorry. I apparently did not know
the code flow well enough to prevent this problem in the first place.

>> > Resolve this by splitting off the read mutex initialization from
>> > init_threaded_search. Instead initialize (and clean up) the read
>> > mutex in cmd_pack_objects.
>>
>> Maybe move the mutex to 'struct packing_data' and initialize it in
>> prepare_packing_data(), so we centralize mutex at two locations:
>> generic ones go there, command-specific mutexes stay here in
>> init_threaded_search(). We could also move oe_get_size_slow() back to
>> pack-objects.c (the one outside builtin/).
>
>
> I was already thinking that generic mutexes should be separated from
> command specific ones (that's why I introduced init_read_mutex and
> cleanup_read_mutex, but that may well not be the right exposure.)
> I'll try my hand at this tonight (just moving the mutex to struct
> packing_data and initializing it in prepare_packing_data, I'll leave
> large code moves to the experts) and see how it turns out.

Yes, leave the code move for now. Bug fixes should stay small and simple
(and get merged faster).
--
Duy
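
For illustration only, here is a minimal sketch of the idea discussed above: the lock lives in the data structure and is initialized when that structure is prepared, so serial callers that run before any threaded search still find a valid mutex. This is not the actual git code. Every name ending in _sketch is a hypothetical stand-in (for struct packing_data, prepare_packing_data() and oe_get_size_slow()), and the slow_size field is an invented placeholder for whatever the real slow path reads.

```c
#include <pthread.h>

/* Hypothetical, stripped-down stand-in for git's struct packing_data. */
struct packing_data_sketch {
	pthread_mutex_t read_lock; /* generic lock, owned by the data itself */
	unsigned long slow_size;   /* invented placeholder for guarded data */
};

/*
 * Stand-in for prepare_packing_data(): the mutex is initialized here,
 * before any serial or threaded caller can reach the slow path.
 * Returns 0 on success, like pthread_mutex_init().
 */
static int prepare_packing_data_sketch(struct packing_data_sketch *pdata)
{
	pdata->slow_size = 0;
	return pthread_mutex_init(&pdata->read_lock, NULL);
}

/*
 * Stand-in for oe_get_size_slow(): safe to call from serial code and
 * from worker threads alike, because the lock always exists by now.
 */
static unsigned long get_size_slow_sketch(struct packing_data_sketch *pdata)
{
	unsigned long size;

	pthread_mutex_lock(&pdata->read_lock);
	size = pdata->slow_size;
	pthread_mutex_unlock(&pdata->read_lock);
	return size;
}

/* Counterpart cleanup, called once when the packing data is released. */
static int release_packing_data_sketch(struct packing_data_sketch *pdata)
{
	return pthread_mutex_destroy(&pdata->read_lock);
}
```

With this layout, command-specific mutexes can stay in init_threaded_search() while the generic one follows the lifetime of the packing data.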