On Fri, Oct 01, 2010 at 02:45:17PM -0700, David Daney wrote:

> In user space the rmb() must expand to a SYNC instruction.  I am not
> sure what your version in the patch is doing with all those NOPs.  That
> is not guaranteed to do anything.

That's a rather old version of the kernel rmb macro, I think.  The NOPs
were there to enforce ordering of a mix of cached and uncached accesses
on the R4400 (not R4000), where according to my reading the manual leaves
it a bit unclear whether a SYNC is sufficient or whether the pipeline
needs to be drained in addition.  See version 2 of the R4000/R4400 User's
Manual.

> The instruction set specifications say that SYNC orders all loads and
> stores.  This is a heavier operation than rmb() demands, but is the only
> universally available instruction that imposes ordering.
>
> For processors that do not support SYNC, the kernel will emulate it, so
> it is safe to use in userspace.  I wouldn't worry about emulation
> overhead though, because processors that lack SYNC probably also lack
> performance counters, so are not as interesting from a perf-tool point
> of view.

Yes, just use SYNC.  SYNC-less processors would only be R2000/R3000
processors and a few other oddball processors, which have been totally
uninteresting for performance optimization for years.

  Ralf
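
For illustration only, a minimal sketch of what a SYNC-based userspace
rmb() could look like, assuming GCC-style inline assembly.  This is not
the macro from the patch under discussion.  The ".set mips2" override is
an assumption so that the assembler accepts SYNC when building for MIPS I,
the non-MIPS fallback exists only so the sketch compiles elsewhere, and
read_after_head() is a hypothetical helper showing where the barrier would
sit when reading a shared buffer.

```c
#include <stdint.h>

/* Hypothetical userspace rmb(); not the actual patch under discussion. */
#if defined(__mips__)
#define rmb() __asm__ __volatile__(	\
	".set	push\n\t"		\
	".set	mips2\n\t"		\
	"sync\n\t"			\
	".set	pop"			\
	: : : "memory")
#else
/* Fallback so the sketch builds on other targets: GCC/Clang full barrier. */
#define rmb() __sync_synchronize()
#endif

/*
 * Hypothetical reader: load the producer's head index first, then the
 * data slot it refers to.
 */
static inline uint64_t read_after_head(const volatile uint64_t *head,
				       const volatile uint64_t *data)
{
	uint64_t h = *head;

	rmb();	/* order the head load before the data load below */
	return data[h];
}
```

The "memory" clobber additionally keeps the compiler from moving loads
across the barrier; the SYNC itself takes care of the hardware side.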