Hei Chan <structurechart@xxxxxxxxx> writes:

> You mentioned the statement "is a compiler scheduling barrier for all
> expressions that load from or store values to memory".  Does "memory" mean the
> main memory?  Or does it include the CPU cache?

I tried to explain what I meant by way of example.  It means pointer
reference, array reference, volatile variable access.  Also I should
have added global variable access.  In general it means memory from the
point of view of the compiler.  The compiler doesn't know anything about
the CPU cache.  When thinking about a "compiler scheduling barrier," you
have to think about the world that the compiler sees, which is quite
different from, though obviously related to, the world that the hardware
sees.

Ian

> ----- Original Message ----
> From: Ian Lance Taylor <iant@xxxxxxxxxx>
> To: Hei Chan <structurechart@xxxxxxxxx>
> Cc: gcc-help@xxxxxxxxxxx
> Sent: Mon, April 11, 2011 2:42:07 PM
> Subject: Re: full memory barrier?
>
> Hei Chan <structurechart@xxxxxxxxx> writes:
>
>> I am a little bit confused what asm volatile ("" : : : "memory") does.
>>
>> I searched online; many people said that it creates the "full memory barrier".
>>
>> I have a test code:
>> int main() {
>>     bool bar;
>>     asm volatile ("" : : : "memory");
>>     bar = true;
>>     return 1;
>> }
>>
>> Running g++ -c -g -Wa,-a,-ad foo.cpp gives me:
>>
>>    2:foo.cpp ****     bool bar;
>>    3:foo.cpp ****     asm volatile ("" : : : "memory");
>>   22           .loc 1 3 0
>>    4:foo.cpp ****     bar = true;
>>   23           .loc 1 4 0
>>
>> It doesn't involve any fence instruction.
>>
>> Maybe I completely misunderstand the idea of "full memory barrier".
>
> The definition of "memory barrier" is ambiguous when looking at code
> written in a high-level language.
>
> The statement "asm volatile ("" : : : "memory");" is a compiler
> scheduling barrier for all expressions that load from or store values to
> memory.  That means something like a pointer dereference, an array
> index, or an access to a volatile variable.  It may or may not include a
> reference to a local variable, as a local variable need not be in
> memory.
>
> This kind of compiler scheduling barrier can be used in conjunction with
> a hardware memory barrier.  The compiler doesn't know that a hardware
> memory barrier is special, and it will happily move memory access
> instructions across the hardware barrier.  Therefore, if you want to use
> a hardware memory barrier in compiled code, you must use it along with a
> compiler scheduling barrier.
>
> On the other hand a compiler scheduling barrier can be useful even
> without a hardware memory barrier.  For example, in a coroutine based
> system with multiple light-weight threads running on a single processor,
> you need a compiler scheduling barrier, but you do not need a hardware
> memory barrier.
>
> gcc will generate a hardware memory barrier if you use the
> __sync_synchronize builtin function.  That function acts as both a
> hardware memory barrier and a compiler scheduling barrier.
>
> Ian
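
To make the distinction above concrete, here is a minimal sketch, assuming
GCC or a compatible compiler such as Clang.  It is only an illustration, not
code from this thread: the names payload, ready, publish, publish_smp and
consume are hypothetical.  The globals are used because, as noted above,
global variable accesses count as memory from the compiler's point of view,
whereas a local such as "bar" in the test program need not live in memory
at all.

```cpp
// Compiler scheduling barrier only.  It emits no machine instruction; it
// just stops the compiler from moving memory accesses across this point.
static inline void compiler_barrier() {
    asm volatile("" : : : "memory");
}

// Hypothetical shared state.  Globals are "memory" as the compiler sees it.
int payload = 0;
int ready = 0;

// Single-processor case (e.g. coroutines): a compiler barrier is enough to
// keep the store to payload ahead of the store to ready in the generated code.
void publish(int value) {
    payload = value;
    compiler_barrier();
    ready = 1;
}

// Multi-processor case: __sync_synchronize() is a GCC builtin that acts as
// both a hardware memory barrier and a compiler scheduling barrier.
void publish_smp(int value) {
    payload = value;
    __sync_synchronize();
    ready = 1;
}

// Reader for the single-processor case: returns -1 until something has been
// published.  The barrier keeps the load of payload from being hoisted above
// the check of ready.
int consume() {
    if (!ready) {
        return -1;
    }
    compiler_barrier();
    return payload;
}
```

Called from one thread, consume() returns -1 before any publish and the
published value afterwards.  Across real threads these plain ints would be a
data race in standard C++, so production code would normally use std::atomic
instead; the sketch is only meant to show where each kind of barrier goes.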