Re: [PATCH 17/28] ARC: add compiler barrier to LLSC based cmpxchg

Peter Zijlstra <peterz@xxxxxxxxxxxxx> · Tue, 9 Jun 2015 14:23:27 +0200

On Tue, Jun 09, 2015 at 05:18:17PM +0530, Vineet Gupta wrote:
> When auditing cmpxchg call sites, Chuck noted that gcc was optimizing
> away some of the desired LDs.
> 
> |	do {
> |		new = old = *ipi_data_ptr;
> |		new |= 1U << msg;
> |	} while (cmpxchg(ipi_data_ptr, old, new) != old);
> 
> was compiled to the code below:
> 
> | 8015cef8:	ld         r2,[r4,0]  <-- First LD
> | 8015cefc:	bset       r1,r2,r1
> |
> | 8015cf00:	llock      r3,[r4]  <-- atomic op
> | 8015cf04:	brne       r3,r2,8015cf10
> | 8015cf08:	scond      r1,[r4]
> | 8015cf0c:	bnz        8015cf00
> |
> | 8015cf10:	brne       r3,r2,8015cf00  <-- Branch doesn't go to orig LD
> 
> Although this was fixed by adding an ACCESS_ONCE at this call site, it
> seems safer (for now at least) to add a compiler barrier to the LLSC
> based cmpxchg.

This is in fact required, not merely safer. cmpxchg() must imply a full
memory barrier both _before_ and _after_ the op, and each of those
implies a compiler barrier.
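
To make the requirement concrete, here is a rough user-space sketch, not
the ARC patch itself. It assumes GCC/Clang __atomic builtins, and the
names sketch_cmpxchg() and set_msg_bit() are made up for illustration.
The fences on both sides of the compare-and-swap also act as compiler
barriers, so the plain load at the top of the retry loop has to be
re-issued on every iteration instead of being hoisted out.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the kernel's cmpxchg(): returns the value
 * that was found at *ptr. The seq_cst fences before and after model the
 * full memory barriers; each one is also a compiler barrier.
 */
static unsigned int sketch_cmpxchg(unsigned int *ptr, unsigned int old,
				   unsigned int new)
{
	unsigned int expected = old;

	__atomic_thread_fence(__ATOMIC_SEQ_CST);	/* barrier before the op */
	/* On failure, 'expected' is overwritten with the current value. */
	__atomic_compare_exchange_n(ptr, &expected, new, 0,
				    __ATOMIC_RELAXED, __ATOMIC_RELAXED);
	__atomic_thread_fence(__ATOMIC_SEQ_CST);	/* barrier after the op */

	return expected;
}

/* Same shape as the quoted call site: set bit 'msg' in *ipi_data_ptr. */
static void set_msg_bit(unsigned int *ipi_data_ptr, unsigned int msg)
{
	unsigned int old, new;

	assert(msg < 32);	/* keep the shift well defined */

	do {
		/* Plain load; the fences above force a reload on retry. */
		new = old = *ipi_data_ptr;
		new |= 1U << msg;
	} while (sketch_cmpxchg(ipi_data_ptr, old, new) != old);
}
```

Without the barriers, the compiler is free to treat the first load as
loop invariant, which is the code generation shown in the quoted
disassembly.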

Acked-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>