Re: Question about __builtin_ia32_mfence and memory barriers


 



> A better choice these days is __atomic_thread_fence(__ATOMIC_SEQ_CST)
> (or __atomic_signal_fence).

This sounded so promising. Unfortunately, it doesn't produce the results I need. I can put all of the following statements in the code, and not one of them generates any fence instruction at all:

    __atomic_thread_fence(__ATOMIC_RELAXED);
    __atomic_thread_fence(__ATOMIC_CONSUME);
    __atomic_thread_fence(__ATOMIC_ACQUIRE);
    __atomic_thread_fence(__ATOMIC_RELEASE);
    __atomic_thread_fence(__ATOMIC_ACQ_REL);

    __atomic_signal_fence(__ATOMIC_RELAXED);
    __atomic_signal_fence(__ATOMIC_CONSUME);
    __atomic_signal_fence(__ATOMIC_ACQUIRE);
    __atomic_signal_fence(__ATOMIC_RELEASE);
    __atomic_signal_fence(__ATOMIC_ACQ_REL);
    __atomic_signal_fence(__ATOMIC_SEQ_CST);
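
For anyone who wants to reproduce this, a minimal test file along the following lines should be enough. This is only a sketch: the function names and the `shared` variable are made up for illustration, and I'm assuming a GCC-compatible compiler that provides the `__atomic` builtins. Each fence sits between two stores in its own function so that it is easy to find in the generated assembly.

```c
/* fences.c: one fence per function, so each can be inspected separately.
 * All names here are hypothetical; only the __atomic builtins are real. */
int shared;

void thread_acq_rel(void)
{
    shared = 1;
    __atomic_thread_fence(__ATOMIC_ACQ_REL);   /* thread fence, acquire+release */
    shared = 2;
}

void thread_seq_cst(void)
{
    shared = 1;
    __atomic_thread_fence(__ATOMIC_SEQ_CST);   /* thread fence, sequentially consistent */
    shared = 2;
}

void signal_seq_cst(void)
{
    shared = 1;
    __atomic_signal_fence(__ATOMIC_SEQ_CST);   /* signal fence, same memory order */
    shared = 2;
}
```

Compiling with something like `gcc -O2 -S fences.c` and reading `fences.s` shows which of the functions, if any, contain a fence instruction.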

And while I get an mfence instruction with this:

    __atomic_thread_fence(__ATOMIC_SEQ_CST);

It doesn't produce quite the same instruction ordering as:

    asm volatile ("mfence" ::: "memory");

This makes me think that whatever __ATOMIC_SEQ_CST means, it is not the same as the "memory" clobber. I'm also looking to support SFENCE and LFENCE, which the __atomic fences don't appear to generate at all.
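
To make the goal concrete, here is roughly what I have in mind for the three x86 fences, as a sketch rather than a finished solution. The `fence_*` macro names and the `publish`/`consume` helpers are invented for illustration, and I'm assuming GCC-style inline asm on x86 with SSE2 available. On any other target the sketch just falls back to a sequentially consistent thread fence. The helpers only show where the fences would be placed; they are not meant as a complete or portable synchronization scheme.

```c
/* Sketch only. The fence_* names are hypothetical, not from any library. */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__SSE2__))
/* Each macro emits the instruction and, through the "memory" clobber,
 * also tells the compiler not to move memory accesses across it. */
#define fence_full()  __asm__ __volatile__ ("mfence" ::: "memory")
#define fence_store() __asm__ __volatile__ ("sfence" ::: "memory")
#define fence_load()  __asm__ __volatile__ ("lfence" ::: "memory")
#else
/* Assumed fallback for non-x86 targets: the strongest __atomic fence. */
#define fence_full()  __atomic_thread_fence(__ATOMIC_SEQ_CST)
#define fence_store() __atomic_thread_fence(__ATOMIC_SEQ_CST)
#define fence_load()  __atomic_thread_fence(__ATOMIC_SEQ_CST)
#endif

static int payload;          /* data being handed over */
static volatile int ready;   /* flag set once payload is written */

/* Write the data, then the flag, with a store fence in between. */
void publish(int value)
{
    payload = value;
    fence_store();
    ready = 1;
}

/* Return the data if the flag is set, or -1 if nothing was published. */
int consume(void)
{
    if (!ready)
        return -1;
    fence_load();
    return payload;
}
```

Calling `publish(42)` and then `consume()` from the same thread returns 42. The interesting part is the instruction ordering the compiler produces around each macro.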

> I'm not clear on whether _mm_mfence is meant to be a compiler memory barrier or not.

Every authoritative reference I have found is maddeningly silent on this point.

However, I have tried compiling x64 code with MSVC, and the instruction ordering it produces for _mm_mfence is not the same as what it produces for _mm_sfence. In fact, the asm produced when using _mm_sfence bears a striking similarity to what you get with just _WriteBarrier (minus the sfence instruction, of course), and _mm_mfence looks like _ReadWriteBarrier.

While I'm not prepared to call this conclusive evidence, it is becoming suspicious.

And apparently I'm not the only person who thinks there is a problem here (http://doxygen.reactos.org/dd/dcb/intrin__x86_8h_a0dee6d755a43d9f9d8072d6202b487db.html#a0dee6d755a43d9f9d8072d6202b487db). I was already concerned about using 2 statements and hoping the compiler wouldn't re-order any code around them. I'm not convinced that using 3 statements makes me feel any better.

dw



