I recently looked at git's condvar implementation for use in another
project, and found a couple of simple optimization opportunities.
We can drop the waiters_lock, and we can make broadcast asynchronous
if it is waking up only one thread.

I made two simple tests:

- 1 thread sending a broadcast every 10 msec, 10 threads calling
  pthread_cond_wait every 100 msec. Timings are (average of three runs):

      before    2.094 us/wait    4.015 us/broadcast
      after     2.064 us/wait    2.883 us/broadcast

  i.e. most broadcasts and waits hit the fast path, and the few that
  don't probably avoid the rendezvous after the patch. In this case
  the waiters_lock is always hitting its own fast path. The speedup is
  mostly in broadcast, and comes mostly from the second patch.

- 1 thread sending a broadcast every 100 msec, 10 threads calling
  pthread_cond_wait every 10 msec. Timings are:

      before    17.59 us/wait    192.2 us/broadcast
      after     8.959 us/wait    141.1 us/broadcast

  i.e. most broadcasts hit the slow path, and there will also be high
  contention on waiters_lock after the broadcast. In this case the
  speedup comes from avoiding locks in the first patch.

I have tested these patches quite thoroughly outside git, but not as
part of it, so help with that would be appreciated.

Thanks,

Paolo

Paolo Bonzini (2):
  win32: optimize condition variable implementation
  win32: optimize pthread_cond_broadcast

 compat/win32/pthread.c |   71 +++++++++++++++++++++++++---------------------
 compat/win32/pthread.h |    1 -
 2 files changed, 38 insertions(+), 34 deletions(-)
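
For reference, a harness along the lines of the two tests described above
could look like the sketch below. It is not the program used for the
timings in this message. All names in it (struct bench, run_bench, and so
on) are hypothetical, it is written against the plain POSIX pthread API
rather than compat/win32, and it assumes the broadcaster holds the mutex
while broadcasting. It times only the pthread_cond_broadcast call; how the
us/wait figures were measured is not stated here, so the sketch just counts
wakeups.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical harness state; none of these names come from the patches. */
struct bench {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	unsigned long generation;  /* bumped once per broadcast */
	int stop;                  /* set when the broadcaster is done */
	long wait_period_us;       /* pause between waits in each waiter */
	long wakeups;              /* waits that were ended by a broadcast */
};

static double now_us(void)
{
	struct timespec ts;
	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec * 1e6 + ts.tv_nsec / 1e3;
}

static void sleep_us(long us)
{
	struct timespec ts = { us / 1000000, (us % 1000000) * 1000 };
	nanosleep(&ts, NULL);
}

static void *waiter(void *arg)
{
	struct bench *b = arg;

	for (;;) {
		sleep_us(b->wait_period_us);
		pthread_mutex_lock(&b->lock);
		unsigned long seen = b->generation;
		/* Predicate loop, so spurious wakeups are harmless. */
		while (!b->stop && b->generation == seen)
			pthread_cond_wait(&b->cond, &b->lock);
		int done = b->stop;
		if (!done)
			b->wakeups++;
		pthread_mutex_unlock(&b->lock);
		if (done)
			return NULL;
	}
}

/* Returns the average cost of one pthread_cond_broadcast call, in us. */
static double run_bench(int nwaiters, int nbroadcasts,
			long broadcast_period_us, long wait_period_us,
			long *wakeups)
{
	struct bench b = { .wait_period_us = wait_period_us };
	pthread_t tid[64];
	double total_us = 0;

	assert(nwaiters > 0 && nwaiters <= 64 && nbroadcasts > 0);
	pthread_mutex_init(&b.lock, NULL);
	pthread_cond_init(&b.cond, NULL);
	for (int i = 0; i < nwaiters; i++)
		pthread_create(&tid[i], NULL, waiter, &b);

	for (int i = 0; i < nbroadcasts; i++) {
		sleep_us(broadcast_period_us);
		pthread_mutex_lock(&b.lock);
		b.generation++;
		double t0 = now_us();
		pthread_cond_broadcast(&b.cond);
		total_us += now_us() - t0;
		pthread_mutex_unlock(&b.lock);
	}

	/* Tell the waiters to exit, wake any that are blocked, and join. */
	pthread_mutex_lock(&b.lock);
	b.stop = 1;
	pthread_cond_broadcast(&b.cond);
	pthread_mutex_unlock(&b.lock);
	for (int i = 0; i < nwaiters; i++)
		pthread_join(tid[i], NULL);

	*wakeups = b.wakeups;
	pthread_cond_destroy(&b.cond);
	pthread_mutex_destroy(&b.lock);
	return total_us / nbroadcasts;
}
```

With this sketch, the first test would correspond to something like
run_bench(10, n, 10000, 100000, &wakeups) and the second to
run_bench(10, n, 100000, 10000, &wakeups), for some number of broadcasts n.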