On Tue, Jul 07, 2015 at 01:51:54PM -0400, Waiman Long wrote:
> >-	cnts = atomic_add_return(_QR_BIAS,&lock->cnts) - _QR_BIAS;
> >+	atomic_add(_QR_BIAS,&lock->cnts);
> >+	cnts = smp_load_acquire((u32 *)&lock->cnts);
> >  	rspin_until_writer_unlock(lock, cnts);
> >
> >  	/*
>
> Atomic add in x86 is actually a full barrier too. The performance difference
> between "lock add" and "lock xadd" should be minor. The additional load,
> however, could potentially cause an additional cacheline load on a contended
> lock. So do you see actual performance benefit of this change in ARM?

Yes, atomic_add() does not imply (and does not have) any memory barriers on ARM.
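
For illustration only, here is a rough user-space analogy of the two forms in
the quoted hunk, written with C11 <stdatomic.h> rather than the kernel
primitives. The names QR_BIAS, reader_enter_fetch_add(),
reader_enter_add_then_acquire() and reader_enter_demo() are hypothetical, and
the bias value is an assumption. The C11 orderings only approximate the kernel
semantics: seq_cst stands in for the fully ordered atomic_add_return(), and a
relaxed add stands in for the unordered atomic_add().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for _QR_BIAS; the value is an assumption. */
#define QR_BIAS (1u << 8)

/*
 * Old form: one fully ordered read-modify-write.
 * atomic_fetch_add() defaults to memory_order_seq_cst and returns the
 * value held before the add, i.e. the count without this reader's bias.
 */
static inline uint32_t reader_enter_fetch_add(_Atomic uint32_t *cnts)
{
    return atomic_fetch_add(cnts, QR_BIAS);
}

/*
 * New form: an add with no ordering, followed by a separate acquire load.
 * The load returns the current value, which already includes this
 * reader's bias and any concurrent updates made in between.
 */
static inline uint32_t reader_enter_add_then_acquire(_Atomic uint32_t *cnts)
{
    atomic_fetch_add_explicit(cnts, QR_BIAS, memory_order_relaxed);
    return atomic_load_explicit(cnts, memory_order_acquire);
}

/* Single-threaded check of the uncontended case. */
static inline void reader_enter_demo(void)
{
    _Atomic uint32_t cnts = 0;

    /* Old form reports the pre-add value: 0. */
    assert(reader_enter_fetch_add(&cnts) == 0);
    /* New form reports the post-add value: two reader biases so far. */
    assert(reader_enter_add_then_acquire(&cnts) == 2 * QR_BIAS);
}
```

Note that the two helpers do not return the same thing: the first yields the
old count, the second the current count including the reader bias. In this
sketch that only matters if the caller inspects the reader bits.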