On 16/03/2012 13:03, NightStrike wrote:
On Fri, Mar 16, 2012 at 12:21 AM, Ian Lance Taylor <iant@xxxxxxxxxx> wrote:
NightStrike <nightstrike@xxxxxxxxx> writes:
If I am interacting with shared memory, gcc doesn't know if another
process changes values. This creates issues for optimization. I know
that "volatile" can help address this, but it winds up causing a giant
mess of other problems (like, for instance, I can't pass a volatile
into memcpy, or pretty much anything else from the standard
libraries).
No, volatile cannot address this. This is not what the volatile
qualifier is for. The volatile qualifier is designed for working with
memory mapped hardware. It is not designed for multi-processor shared
memory. If a program is not multi-processor safe, then adding volatile
will never make it multi-processor safe.
Do you have to use volatile if you're writing to memory mapped
hardware, or just reading?
You do not /have/ to use volatile for writing or reading memory mapped
hardware, but it is usually a good idea.
When you use a volatile access, you are telling the compiler "do a read
or write here, exactly once - don't do it speculatively, don't re-use
old results, and keep it in program order relative to all other
volatile reads and writes".
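
As a rough illustration of that contract, here is a minimal sketch in C.
The register layout, the DEMO_STATUS_READY bit and the function name are
all made up for the example; on real hardware the layout comes from the
datasheet and the pointer from however your OS maps the device.

```c
#include <stdint.h>

/* Hypothetical device register block (names and layout are assumptions). */
typedef struct {
    volatile uint32_t status;
    volatile uint32_t data;
} demo_regs_t;

#define DEMO_STATUS_READY 0x1u

/* Send two words to the (hypothetical) device.
 * Because the members are volatile, the compiler must:
 *   - read status exactly once, here, and not re-use an older value;
 *   - emit both stores to data, in this order, without merging them. */
int demo_send_pair(demo_regs_t *regs, uint32_t first, uint32_t second)
{
    if ((regs->status & DEMO_STATUS_READY) == 0u)
        return -1;          /* device not ready, nothing written */

    regs->data = first;     /* first store is really performed */
    regs->data = second;    /* second store follows it */
    return 0;
}
```

Note that this only constrains what the compiler emits. Whether the
accesses reach the device in that form still depends on how the memory
region is mapped.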
You can get the same results by using inline assembly, or external
function calls, or anything else that also specifically forces the read
or write. Most operating systems have some sort of calls or macros to
do this.
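
A sketch of what such macros can look like with gcc, assuming the GNU
extensions for inline assembly and __typeof__. The names are invented for
the example; the access macro is modelled loosely on the Linux kernel's
ACCESS_ONCE, and is not a copy of any particular OS's implementation.

```c
/* Compiler-only barrier: an empty asm with a "memory" clobber tells gcc
 * not to move memory accesses across this point. It emits no instruction
 * and does nothing about CPU re-ordering or caches. */
#define demo_compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Force a single real access to an otherwise non-volatile object by
 * going through a volatile-qualified lvalue (GNU __typeof__ extension). */
#define DEMO_ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

static int demo_flag;
static int demo_payload;

/* Store the payload, then set the flag, in that order as far as the
 * compiler is concerned. */
void demo_publish(int value)
{
    demo_payload = value;
    demo_compiler_barrier();
    DEMO_ACCESS_ONCE(demo_flag) = 1;
}

/* Returns 1 and fills *out if the flag was set, 0 otherwise. */
int demo_poll(int *out)
{
    if (!DEMO_ACCESS_ONCE(demo_flag))
        return 0;
    demo_compiler_barrier();
    *out = demo_payload;
    return 1;
}
```

This gets you the compiler-level guarantees of a volatile access without
declaring the objects themselves volatile, so they can still be passed
to ordinary library functions.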
If you use normal writes, you will /often/ get the same results as a
volatile write - but you have no guarantees. In particular, the
compiler can "save up" the write and do it later, it can roll together
multiple writes to the same address, it can (in some circumstances, such
as with file or function static data) omit the write altogether or use
multiple writes, and it can re-order the write around any other reads
and writes.
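
To make the difference concrete, here is a small sketch (function names
invented for the example). Both functions leave the same final value in
memory; what differs is what the compiler is allowed to emit along the
way.

```c
/* Plain stores: the compiler is free to drop the first store and emit
 * only "*reg = 2", or to delay or re-order the stores relative to other
 * non-volatile accesses. */
void demo_plain_writes(unsigned *reg)
{
    *reg = 1u;
    *reg = 2u;
}

/* Volatile stores: the compiler must emit both stores, each exactly
 * once, in this order. */
void demo_volatile_writes(volatile unsigned *reg)
{
    *reg = 1u;
    *reg = 2u;
}
```

If the target is a hardware register where the intermediate value 1
matters (a command register, say), only the second version is reliable.
Comparing the output of "gcc -O2 -S" for the two is a quick way to see
what your compiler actually does.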
This is because the issues related to making code multi-processor safe
are related to memory barriers and memory cache behaviour. Adding a
volatile qualifier will not change the program's behaviour with respect
to either.
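
For comparison, here is a sketch of a flag-plus-payload handoff that does
ask for memory ordering. It uses C11 <stdatomic.h> release/acquire
operations, which is a language-level facility and not something from
this thread. The mailbox type and function names are made up, and it
assumes a C11 compiler with lock-free atomics on the target. Whether
that is sufficient for memory shared between processes is something to
check for your platform.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical one-slot mailbox living in shared memory. */
typedef struct {
    int payload;
    atomic_bool ready;
} demo_mailbox_t;

/* Writer: the release store orders the payload write before the flag
 * becomes visible, for both the compiler and the processor. */
void demo_mailbox_send(demo_mailbox_t *box, int value)
{
    box->payload = value;
    atomic_store_explicit(&box->ready, true, memory_order_release);
}

/* Reader: the acquire load pairs with the release store, so once the
 * flag is seen as true the payload written before it is visible too. */
bool demo_mailbox_receive(demo_mailbox_t *box, int *out)
{
    if (!atomic_load_explicit(&box->ready, memory_order_acquire))
        return false;
    *out = box->payload;
    return true;
}
```

Declaring payload and ready as plain volatile ints instead would keep
the compiler from re-ordering the two accesses, but would not insert any
barrier instruction for the processor.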
Is caching the reason that makes another process sharing a memory
address different than a piece of hardware sharing a memory address?
Caching is one of the reasons, and is usually the culprit when reading
data does not give the expected results - even if the read is marked
"volatile". The compiler cannot force the memory system to give the
results from main memory - that depends on things like the MMU setup,
the cache hardware, cache snooping and cache consistency hardware and
setup, etc. Reads are also affected by speculation - speculative reads,
speculative execution, branch prediction, etc., which can wildly
re-order reads.
Similarly, when you write data - even with volatile writes (or writes
followed by a memory barrier) - the compiler knows nothing about caches.
Writes typically have additional buffers and queues, and are
re-ordered for memory bus efficiency.
This is why "volatile" is generally not enough - and why you are
normally better off using the OS's API for such shared data. The
/implementation/ of such APIs typically makes use of "volatile"
accesses - but also often other things, such as cache control assembly
instructions.
For memory-mapped hardware, the MMU is /usually/ configured so that such
accesses go straight through and are not cached for reads or writes -
thus "volatile" accesses are /usually/ enough. But you would have to
check the details for your OS and target to see how it configures such
things.