Thanks for the explanation. :)

Kyle Moffett <kyle@xxxxxxxxxxxxxxx> wrote:
> On Thu, Oct 27, 2011 at 19:00, Atsushi Nakagawa <atnak@xxxxxxxxx> wrote:
> > Erik Faye-Lund <kusmabite@xxxxxxxxx> wrote:
> >> On Wed, Oct 26, 2011 at 5:44 AM, Kyle Moffett <kyle@xxxxxxxxxxxxxxx> wrote:
> >> > On Tue, Oct 25, 2011 at 16:51, Erik Faye-Lund <kusmabite@xxxxxxxxx> wrote:
> >> >> [...]
> >> >
> >> > No, I'm afraid that won't solve the issue (at least in GCC, not sure about MSVC)
> >> >
> >> > A write barrier in one thread is only effective if it is paired with a
> >> > read barrier in the other thread.
> >> >
> >> > Since there's no read barrier in the "while(mutex->autoinit != 0)",
> >> > you don't have any guaranteed ordering.
> >
> > Out of curiosity, where could re-ordering be a problem here?  I'm
> > thinking probably at "EnterCriticalSection(&mutex->cs)" and the contents
> > of "mutex->cs" not being propagated to the waiting thread.  However,
> > shouldn't that be a non-problem, as far as compiler reordering goes,
> > because it's an external function call and only the address of mutex->cs
> > is passed?
> >
> > [...]
>
> Ok, so here's the race condition:
>
> Thread1                                    Thread2
>                                            /* Speculative prefetch */
>                                            prefetch(*mutex);
>
> if (mutex->autoinit) {
> if (ICE(&mutex->autoinit, -1, 1) != -1) {
> /* Now mutex->autoinit == -1 */
> pthread_mutex_init(mutex, NULL);
> /* This forces writes out to memory */
> ICE(&mutex->autoinit, 0, -1);
>
>                                            if (mutex->autoinit) {} /* false */
>                                            /* No read barrier here */
>                                            EnterCriticalSection(&mutex->cs);
>                                            /* Use cached mutex->cs from earlier */

Ok, so there's no way of skimping on that one memory barrier in every
visit to pthread_mutex_lock().  Interesting.  Makes me wonder how it
trades off to lazy initialization.

>
> Even though you forced the memory write to be ordered in Thread 1 you
> did not ensure that the read of autoinit occurred before the read of
> mutex->cs in Thread 2.  If the first thing that EnterCriticalSection
> does is follow a pointer or read a mutex key value in mutex->cs then
> it won't necessarily get the right answer.
>
> The rule of memory barriers is that they ALWAYS come in pairs.  This
> simple example guarantees that Thread2 will read "tmp_a" == 1 if it
> previously read "tmp_b" == 1, although getting "tmp_a" == 1 and
> "tmp_b" != 1 is still possible.
>
> Thread1:
>   a = 1;
>   write_barrier();
>   b = 1;
>
> Thread2:
>   tmp_b = b;
>   read_barrier();
>   tmp_a = a;
>
> I think there's a Documentation/memory-barriers.txt file in the kernel
> source code with more helpful info.

-- 
Atsushi Nakagawa <atnak@xxxxxxxxx>
Changes are made when there is inconvenience.
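
P.S. For the archives, here is a minimal sketch of the paired-barrier
example quoted above, written with C11 atomics and pthreads.  It is only
an illustration under some assumptions: I map write_barrier() to
atomic_thread_fence(memory_order_release) and read_barrier() to
atomic_thread_fence(memory_order_acquire), which seems to be the closest
portable equivalent but is not the kernel's primitive.  The names
thread1, thread2, struct observation and count_violations() are made up
for this sketch.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* 'a' and 'b' mirror the variables in the quoted example. */
static atomic_int a, b;

/* What Thread2 saw in one round (hypothetical helper type). */
struct observation { int tmp_a, tmp_b; };

static void *thread1(void *arg)
{
    (void)arg;
    atomic_store_explicit(&a, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);  /* stands in for write_barrier() */
    atomic_store_explicit(&b, 1, memory_order_relaxed);
    return NULL;
}

static void *thread2(void *arg)
{
    struct observation *obs = arg;
    obs->tmp_b = atomic_load_explicit(&b, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);  /* stands in for read_barrier() */
    obs->tmp_a = atomic_load_explicit(&a, memory_order_relaxed);
    return NULL;
}

/*
 * Runs the two threads 'rounds' times and counts how often the
 * forbidden outcome (tmp_b == 1 but tmp_a != 1) was observed.
 * Returns -1 if a thread could not be started.
 */
int count_violations(int rounds)
{
    int violations = 0;

    for (int i = 0; i < rounds; i++) {
        struct observation obs = {0, 0};
        pthread_t t1, t2;

        /* Reset the shared state before the threads exist. */
        atomic_store(&a, 0);
        atomic_store(&b, 0);

        if (pthread_create(&t2, NULL, thread2, &obs) != 0)
            return -1;
        if (pthread_create(&t1, NULL, thread1, NULL) != 0) {
            pthread_join(t2, NULL);
            return -1;
        }
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);

        if (obs.tmp_b == 1 && obs.tmp_a != 1)
            violations++;
    }
    return violations;
}
```

With both fences in place, count_violations() should never report the
forbidden outcome, while tmp_a == 1 with tmp_b != 1 remains allowed, as
in the quoted text.  Dropping the fence in thread2 removes the guarantee,
which is the situation in the unpatched pthread_mutex_lock() path; a test
run may still not catch it, since whether the reordering shows up depends
on the compiler and CPU.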