Jeff King <peff@xxxxxxxx> writes: > On Fri, Mar 15, 2013 at 03:42:40PM -0700, Stefan Zager wrote: > >> We have uncovered a regression in this commit: >> >> b8a2486f1524947f232f657e9f2ebf44e3e7a243 >> >> The symptom is that 'git fetch' dies with: >> >> error: index-pack died of signal 10 >> fatal: index-pack failed >> >> I have only been able to reproduce it on a Mac thus far; will try >> ubuntu next. We can make it go away by running: >> >> git config pack.threads 1 > > I couldn't reproduce the problem on Linux with the instructions you > gave. I did try running it under valgrind and it produced: > > ==2380== Conditional jump or move depends on uninitialised value(s) > ==2380== at 0x441631: resolve_delta (index-pack.c:837) > ==2380== by 0x4419D6: find_unresolved_deltas_1 (index-pack.c:898) > ==2380== by 0x441A45: find_unresolved_deltas (index-pack.c:914) > ==2380== by 0x4427CA: fix_unresolved_deltas (index-pack.c:1232) > ==2380== by 0x4421F5: conclude_pack (index-pack.c:1111) > ==2380== by 0x443A5C: cmd_index_pack (index-pack.c:1604) > ==2380== by 0x4058A2: run_builtin (git.c:281) > ==2380== by 0x405A35: handle_internal_command (git.c:443) > ==2380== by 0x405C01: main (git.c:532) > > but the line in question is: > > if (deepest_delta < delta_obj->delta_depth) > > And in the debugger, both of those variables appear to have sane values > (nor should either impacted by the patch you bisected to). On top of > that, running with pack.threads=1 produces the same error. So I think it > may be a false positive from valgrind, and unrelated to your issue. I find that somewhat unlikely, for two reasons: memcheck is actually quite good at finding uninitialized memory use, it just isn't that good at distinguishing if it makes a difference. Most false positives are of the "loading an entire word and discarding most of it" kind. Second, the thread-debugging valgrind tools (drd and helgrind) also complain about exactly this line: DRD says: ==20987== Thread 4: ==20987== Conflicting load by thread 4 at 0x007a70d0 size 4 ==20987== at 0x43A783: resolve_delta (index-pack.c:837) ==20987== by 0x43A94F: find_unresolved_deltas (index-pack.c:898) ==20987== by 0x43B0F8: threaded_second_pass (index-pack.c:945) ==20987== by 0x4C2CF60: vgDrd_thread_wrapper (drd_pthread_intercepts.c:341) ==20987== by 0x542FE0D: start_thread (in /lib64/libpthread-2.15.so) ==20987== Allocation context: BSS section of /home/thomas/.local/bin/git ==20987== Other segment start (thread 2) ==20987== at 0x4C30A1F: pthread_mutex_unlock (drd_pthread_intercepts.c:665) ==20987== by 0x43B06E: threaded_second_pass (index-pack.c:122) ==20987== by 0x4C2CF60: vgDrd_thread_wrapper (drd_pthread_intercepts.c:341) ==20987== by 0x542FE0D: start_thread (in /lib64/libpthread-2.15.so) ==20987== Other segment end (thread 2) ==20987== at 0x5436AE3: ??? (in /lib64/libpthread-2.15.so) ==20987== by 0x439C76: unpack_data.constprop.8 (index-pack.c:528) ==20987== by 0x439EA7: get_base_data (index-pack.c:571) ==20987== by 0x43A7B4: resolve_delta (index-pack.c:841) ==20987== by 0x43A94F: find_unresolved_deltas (index-pack.c:898) ==20987== by 0x43B0F8: threaded_second_pass (index-pack.c:945) ==20987== by 0x4C2CF60: vgDrd_thread_wrapper (drd_pthread_intercepts.c:341) ==20987== by 0x542FE0D: start_thread (in /lib64/libpthread-2.15.so) helgrind says: ==21160== Possible data race during read of size 4 at 0x7A70D0 by thread #3 ==21160== Locks held: none ==21160== at 0x43A783: resolve_delta (index-pack.c:837) ==21160== by 0x43A94F: find_unresolved_deltas (index-pack.c:898) ==21160== by 0x43B0F8: threaded_second_pass (index-pack.c:945) ==21160== by 0x4C2D35D: mythread_wrapper (hg_intercepts.c:219) ==21160== by 0x5424E0D: start_thread (in /lib64/libpthread-2.15.so) ==21160== ==21160== This conflicts with a previous write of size 4 by thread #2 ==21160== Locks held: none ==21160== at 0x43A78E: resolve_delta (index-pack.c:838) ==21160== by 0x43A94F: find_unresolved_deltas (index-pack.c:898) ==21160== by 0x43B0F8: threaded_second_pass (index-pack.c:945) ==21160== by 0x4C2D35D: mythread_wrapper (hg_intercepts.c:219) ==21160== by 0x5424E0D: start_thread (in /lib64/libpthread-2.15.so) You were apparently just lucky in catching it before it was even initialized. I can reproduce the above warnings with valgrind --tool=helgrind --trace-children=yes git index-pack <any_pack> i.e., it does not seem to depend on the pack (sample size 3, packs looked at were from my git.git). Duy, what was the reasoning why resolve_delta() does not need to hold locks when it looks when it looks at deepest_delta? My coffee levels aren't up to this task yet. It certainly seems extremely dubious to me, as the code uses the global deepest_delta in threaded sections. You can probably argue that the load/store is atomic on most(?) platforms, but you get no guarantees that deepest_delta at any time in fact holds the maximum value of delta_obj->delta_depth. Furthermore there's another warning shown by helgrind: ==21160== Possible data race during write of size 4 at 0x7A7060 by thread #2 ==21160== Locks held: 1, at address 0x7A70E0 ==21160== at 0x43A840: resolve_delta (index-pack.c:853) ==21160== by 0x43A94F: find_unresolved_deltas (index-pack.c:898) ==21160== by 0x43B0F8: threaded_second_pass (index-pack.c:945) ==21160== by 0x4C2D35D: mythread_wrapper (hg_intercepts.c:219) ==21160== by 0x5424E0D: start_thread (in /lib64/libpthread-2.15.so) ==21160== ==21160== This conflicts with a previous read of size 4 by thread #3 ==21160== Locks held: 1, at address 0x7A7080 ==21160== at 0x43AFC8: threaded_second_pass (index-pack.c:955) ==21160== by 0x4C2D35D: mythread_wrapper (hg_intercepts.c:219) ==21160== by 0x5424E0D: start_thread (in /lib64/libpthread-2.15.so) That one really seems to be a false positive in the sense that threaded_second_pass() doesn't really care if it gets a bad value for nr_resolved_deltas. The only thing that matters is that the counter increment is done with counter_mutex held. -- Thomas Rast trast@{inf,student}.ethz.ch -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html