Re: [msysGit] Re: [PATCH/RFC] mingw: implement PTHREAD_MUTEX_INITIALIZER

Erik Faye-Lund <kusmabite@xxxxxxxxx> · Tue, 25 Oct 2011 22:51:19 +0200

On Tue, Oct 25, 2011 at 10:07 PM, Johannes Sixt <j6t@xxxxxxxx> wrote:
> Am 25.10.2011 17:42, schrieb Erik Faye-Lund:
>> On Tue, Oct 25, 2011 at 5:28 PM, Johannes Sixt <j.sixt@xxxxxxxxxxxxx> wrote:
>>> Am 10/25/2011 16:55, schrieb Erik Faye-Lund:
>>>> +int pthread_mutex_lock(pthread_mutex_t *mutex)
>>>> +{
>>>> +     if (mutex->autoinit) {
>>>> +             if (InterlockedCompareExchange(&mutex->autoinit, -1, 1) != -1) {
>>>> +                     pthread_mutex_init(mutex, NULL);
>>>> +                     mutex->autoinit = 0;
>>>> +             } else
>>>> +                     while (mutex->autoinit != 0)
>>>> +                             ; /* wait for other thread */
>>>> +     }
>>>
>>> The double-checked locking idiom. Very suspicious. Can you explain why it
>>> works in this case? Why are no Interlocked functions needed for the other
>>> accesses of autoinit? ("It is volatile" is the wrong answer to this last
>>> question, BTW.)
>>
>> I agree that it should look a bit suspicious; I'm generally skeptical
>> whenever I see 'volatile' in threading-code myself. But I think it's
>> the right answer in this case. "volatile" means that the compiler
>> cannot optimize away accesses, which is sufficient in this case.
>
> No, it is not, and it took me a train ride to see what's wrong. It has
> nothing to do with autoinit, but with all the other memory locations
> that are written. See here, with pthread_mutex_init() inlined:
>
>  if (mutex->autoinit) {
>
> Assume two threads enter this block.
>
>     if (InterlockedCompareExchange(&mutex->autoinit, -1, 1) != -1) {
>
> Only one thread, A, say on CPU A, will enter this block.
>
>        InitializeCriticalSection(&mutex->cs);
>
> Thread A writes some values. Note that there are no memory barriers
> involved here. Not that I know of or that they would be documented.
>
>        mutex->autoinit = 0;
>
> And it writes another one. Thread A continues below to contend for the
> mutex it just initialized.
>
>     } else
>
> Meanwhile, thread B, say on CPU B, spins in this loop:
>
>        while (mutex->autoinit != 0)
>           ; /* wait for other thread */
>
> When thread B arrives here, it sees the value of autoinit that thread A
> has written above.
>
> HOWEVER, when it continues, there is NO [*] guarantee that it will also
> see the values that InitializeCriticalSection() has written, because
> there were no memory barriers involved. When it continues, there is a
> chance that it calls EnterCriticalSection() with uninitialized values!
>

Thanks for pointing this out, I completely forgot about write re-ordering.

This is indeed a problem. So, shouldn't replacing "mutex->autoinit =
0;" with "InterlockedExchange(&mutex->autoinit, 0)" solve the problem?
InterlockedExchange generates a full memory barrier:
http://msdn.microsoft.com/en-us/library/windows/desktop/ms683590(v=vs.85).aspx

>  }
>
>
> [*] If you compile this code with MSVC >= 2005, "No guarantee" is not
> true, it's exactly the opposite because Microsoft extended the meaning
> of 'volatile' to imply a memory barriere.

Do you have a source for this? I'm not saying it isn't true, I just
never heard of this, and would like to read up on it :)

> This is *NOT* true for gcc in
> general. It may be true for MinGW gcc, but I do not know.
>
>> Basically, the thread that gets the original 1 returned from
>> InterlockedCompareExchange is the only one who writes to
>> mutex->autoinit. All other threads only read the value, and the
>> volatile should make sure they actually do. Since all 32-bit reads and
>> writes are atomic on Windows (see
>> http://msdn.microsoft.com/en-us/library/windows/desktop/ms684122(v=vs.85).aspx
>> "Simple reads and writes to properly-aligned 32-bit variables are
>> atomic operations.") and mutex->autoinit is a LONG, this should be
>> safe AFAICT. In fact, Windows specifically does not have any
>> explicitly atomic writes exactly for this reason.
>
> There is a difference between atomic and coherent: Yes, 32-bit accesses
> are atomic, but they are not automatically coherent: A 32-bit value
> written by one CPU is not instantly visible on the other CPU. 'volatile'
> as per the C lanugage does not add any guarantees that would be of
> interest here. OTOH, Microsoft's definition of 'volatile' does.
>

I never meant to imply this either, I simply forgot about write re-ordering ;)

>> The only ways mutex->autoinit can be updated is:
>> - InterlockedCompareExchange compares it to 1, finds it's identical
>> and inserts -1
>> - intialization is done
>> Both these updates happens from the same thread.
>>
>> Yes, details like this should probably go into the commit message ;)
>
> A comment in the function is preferred!
>

Yes, good point. This is code that looks dangerous (and in this case
potentially was - hopefully eventual future iteration won't be), and
should of course be documented inline...
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html