Re: SMP barriers semantics

Jamie Lokier <jamie@xxxxxxxxxxxxx> · Wed, 17 Mar 2010 13:42:43 +0000

Benjamin Herrenschmidt wrote:
> Maybe we can agree on a set of relaxed accessors defined specifically
> with those semantics (more relaxed implies use of raw_*) :
> 
>   - order is guaranteed between MMIOs
>   - no order is guaranteed between MMIOs and spinlocks

No order between MMIOs and spinlocks will be fun :-)

>   - a read access is not guaranteed to be performed until the read value
>     is actually consumed by the processor

How do you define 'consumed'?  It's not obvious: see
read_barrier_depends() on Alpha, and elsewhere speculative read
optimisations (including by the compiler on IA64, in theory).

> Along with barriers along the line of (that's where we may want to
> discuss a bit I believe)
> 
>   - io_to_mem_rbarrier() : orders MMIO read vs. subsequent memory read
>   - mem_to_io_wbarrier() : order memory write vs. subsequent MMIO write
>                            (includes spin_unlock ?)
>   - io_to_mem_barrier()  : order any MMIO vs. subsequent memory accesses
>   - mem_to_io_barrier()  : order memory accesses vs any subsequent MMIO
>   - flush_io_read(val)   : takes value from MMIO read, enforces that it
>                            has completed before further instructions are
>                            issued, acts as compiler & execution barrier.
> 
> What do you guys think ? I think io_to_mem_rbarrier() and
> mem_to_io_wbarrier() cover most useful cases, except maybe the IO vs.
> locks.

I'm thinking the current scheme with "heavy" barriers that serialise
everything is better - because it's simple to understand.  I think the
above set are likely to be accidentally misused, and even pass review
occasionally when they are wrong, even if it's just due to brain farts.

The "heavy" barrier names could change from rmb/mb/wmb to
io_rmb/io_mb/io_wmb or something.

Is there any real performance advantage expected from more
fine-grained MMIO barriers?

Another approach is a flags argument to a single macro:
barrier(BARRIER_MMIO_READ_BEFORE | BARRIER_MEM_READ_AFTER), etc.
Each arch would expand it to a barrier at least as strong as the one
requested.  C++0x atomics are going down this route, I think.
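
To make the flags idea concrete, here is a rough userspace sketch.
Everything in it is an assumption for illustration: only the two flag
names in my example above come from this mail, the other flags, the
ARCH_* names, arch_pick_barrier() and flagged_barrier() are made up,
and the C11 fences are only stand-ins, since they do not order real
MMIO.  It models an arch that has just the three "heavy" barriers and
picks the weakest one that still covers the request.

```c
#include <stdatomic.h>

/* Hypothetical flags: which accesses before/after the barrier must be ordered. */
enum {
    BARRIER_MMIO_READ_BEFORE  = 1u << 0,
    BARRIER_MMIO_WRITE_BEFORE = 1u << 1,
    BARRIER_MEM_READ_BEFORE   = 1u << 2,
    BARRIER_MEM_WRITE_BEFORE  = 1u << 3,
    BARRIER_MMIO_READ_AFTER   = 1u << 4,
    BARRIER_MMIO_WRITE_AFTER  = 1u << 5,
    BARRIER_MEM_READ_AFTER    = 1u << 6,
    BARRIER_MEM_WRITE_AFTER   = 1u << 7,
};

#define BARRIER_READS_BEFORE  (BARRIER_MMIO_READ_BEFORE  | BARRIER_MEM_READ_BEFORE)
#define BARRIER_WRITES_BEFORE (BARRIER_MMIO_WRITE_BEFORE | BARRIER_MEM_WRITE_BEFORE)
#define BARRIER_READS_AFTER   (BARRIER_MMIO_READ_AFTER   | BARRIER_MEM_READ_AFTER)
#define BARRIER_WRITES_AFTER  (BARRIER_MMIO_WRITE_AFTER  | BARRIER_MEM_WRITE_AFTER)

/* The only barriers this imaginary arch implements. */
enum arch_barrier { ARCH_NONE, ARCH_RMB, ARCH_WMB, ARCH_MB };

/* Pick a barrier at least as strong as the one requested. */
static inline enum arch_barrier arch_pick_barrier(unsigned flags)
{
    unsigned before = flags & (BARRIER_READS_BEFORE | BARRIER_WRITES_BEFORE);
    unsigned after  = flags & (BARRIER_READS_AFTER  | BARRIER_WRITES_AFTER);

    if (!before || !after)
        return ARCH_NONE;   /* one side is empty: nothing to order */
    if (!(before & BARRIER_WRITES_BEFORE) && !(after & BARRIER_WRITES_AFTER))
        return ARCH_RMB;    /* reads vs. later reads only */
    if (!(before & BARRIER_READS_BEFORE) && !(after & BARRIER_READS_AFTER))
        return ARCH_WMB;    /* writes vs. later writes only */
    return ARCH_MB;         /* any mixed request gets the full barrier */
}

/* Issue the chosen barrier; the C11 fences stand in for arch instructions. */
static inline void flagged_barrier(unsigned flags)
{
    switch (arch_pick_barrier(flags)) {
    case ARCH_RMB:  atomic_thread_fence(memory_order_acquire); break;
    case ARCH_WMB:  atomic_thread_fence(memory_order_release); break;
    case ARCH_MB:   atomic_thread_fence(memory_order_seq_cst); break;
    case ARCH_NONE: break;
    }
}
```

With this model, flagged_barrier(BARRIER_MMIO_READ_BEFORE |
BARRIER_MEM_READ_AFTER) ends up as the rmb-like case, and anything
mixing reads and writes falls back to the full barrier.  An arch with
finer-grained instructions could add cases without callers changing.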

-- Jamie