Hi,

I have written inline assembler implementations of the pa_atomic operations for ARMv6 and above. For compatibility with older ARMs I have also written versions using the ARM Linux kernel helper functions (see http://0pointer.de/blog#atomic-rt). Both implementations run almost perfectly now.

However, there is one thing to note about the compare-and-exchange implementations for ARMv6 and above. The semantics of the usual ARM ldrex/strexeq instruction sequence are not identical to the x86 implementations of the same thing, i.e. the exchange is not totally atomic after all. The strexeq instruction has two conditions: the equality to the old value, and the exclusiveness of the operation (i.e. whether the value in memory was touched between the two instructions). The operation fails if either of these conditions fails, i.e. the value in memory is left unchanged. So it is possible that the old-value condition is met but the exclusiveness condition fails, even though the tampered memory value would still meet the old-value condition. The above also applies to the kernel helper implementation of atomic exchange for ARMv6 and above.

Because of the above problem (I suspect), this assertion in pulsecore/async.c sometimes fails under heavy load:

    /* Guaranteed to succeed if we only have a single reader */
    pa_assert_se(pa_atomic_ptr_cmpxchg(&cells[idx], ret, NULL));

The assertion failure has happened with both the kernel helper and the inline asm versions (they are identical in an ARMv6 environment anyway). The failures are not very common, though.

The atomic compare and exchange can also be written so that it retries the operation if the exclusiveness condition fails but the equality condition was OK, which would resemble real atomicity more closely.
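The retry idea can be sketched in portable C11 atomics, where atomic_compare_exchange_weak is likewise allowed to fail even though the stored value matches. This is only an illustration under the assumption of a C11 compiler with <stdatomic.h>; the name cmpxchg_retry is hypothetical and this is not the PulseAudio code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper (not part of PulseAudio): compare-and-exchange that
 * retries when the weak primitive fails although the value still matched. */
static bool cmpxchg_retry(atomic_int *a, int old_i, int new_i) {
    int expected;

    do {
        /* The weak CAS overwrites 'expected' on failure, so reset it. */
        expected = old_i;
        if (atomic_compare_exchange_weak(a, &expected, new_i))
            return true;            /* new_i was stored */
        /* On failure, 'expected' holds the value that was actually seen. */
    } while (expected == old_i);    /* value matched: spurious failure, retry */

    return false;                   /* genuine mismatch, nothing stored */
}
```

With this shape the caller sees a failure only when the value really differed from old_i, which is the behaviour the assertion in pulsecore/async.c relies on.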
The inline assembler version would then look like this:

    static inline int pa_atomic_cmpxchg(pa_atomic_t *a, int old_i, int new_i) {
        unsigned long not_equal, not_exclusive;

        pa_memory_barrier();
        /* Retry only when the value matched old_i but the exclusive store failed */
        do {
            __asm__ __volatile__("@ pa_atomic_cmpxchg\n"
                                 "1: ldrex %0, [%2]\n"
                                 "   subs %0, %0, %3\n"
                                 "   mov %1, %0\n"
                                 "   strexeq %0, %4, [%2]\n"
                                 : "=&r" (not_exclusive), "=&r" (not_equal)
                                 : "r" (&a->value), "Ir" (old_i), "r" (new_i)
                                 : "cc");
        } while(not_exclusive && !not_equal);
        pa_memory_barrier();

        return !not_equal;
    }

A similar external loop can also be added to the kernel helper version, but if the kernel helper in fact makes a system call, it is unnecessary. I wonder if all this is worth the trouble.

So what should be done?

1. Change the above line in pulsecore/async.c to use pa_atomic_store instead, and check whether there are other similar places.
2. Add loops like the one above to the ARM-specific implementations of atomic compare and exchange.

Anyway, I'll produce a proper ARM atomic ops patch as soon as I am happy with it. However, it may take a while, because I am still learning the autoconf magic and I have some other tasks to take care of too.

Cheers,
Jyri

// Jyri Sarha -- my.name at nokia.com