> x86 has compiler barrier inside the relaxed() API so that code does not
> get reordered. ARM64 architecturally guarantees device writes to be observed
> in order.

There are places where you don't even need a compile barrier between
every write.

I had horrid problems getting some ppc code (for a specific embedded SoC)
optimised to have no extra barriers.
I ended up just writing through 'pointer to volatile' and adding an
explicit 'eieio' between the block of writes and status read.

No less painful was doing a byteswapping write to normal memory.

	David
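
For illustration only, here is a rough sketch of the 'pointer to volatile' plus explicit 'eieio' pattern described above. It is not the original SoC code: the register layout (struct fake_dev), the macro name io_order_barrier() and the function write_block_then_status() are all made up for this example. On non-ppc hosts the macro falls back to a plain compiler barrier so that the sketch still builds.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical barrier macro (name invented for this sketch).
 * On PowerPC it emits 'eieio'; elsewhere it is only a compiler
 * barrier, which is enough to build and exercise the sketch.
 */
#if defined(__powerpc__) || defined(__powerpc64__)
#define io_order_barrier() __asm__ __volatile__("eieio" : : : "memory")
#else
#define io_order_barrier() __asm__ __volatile__("" : : : "memory")
#endif

/* Hypothetical device register block: four data words and a status word. */
struct fake_dev {
	uint32_t data[4];
	uint32_t status;
};

/*
 * Write a block of registers with no barrier between the individual
 * writes, then issue a single barrier before reading the status.
 * The volatile qualifier stops the compiler from dropping, merging or
 * reordering these accesses relative to each other.
 */
static uint32_t write_block_then_status(volatile struct fake_dev *dev,
					const uint32_t *src)
{
	int i;

	assert(dev && src);

	for (i = 0; i < 4; i++)
		dev->data[i] = src[i];	/* volatile stores, program order */

	io_order_barrier();		/* one barrier for the whole block */

	return dev->status;		/* volatile load after the barrier */
}
```

In real code 'dev' would point at an ioremap()ed, cache-inhibited mapping rather than ordinary memory. Whether a single barrier per block is sufficient depends on the SoC and on how the region is mapped, so treat that as an assumption to check against the hardware manual.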