On Mon, Jul 20, 2015 at 02:48:49PM +0100, Paul E. McKenney wrote: > On Mon, Jul 20, 2015 at 02:39:06PM +0100, Will Deacon wrote: > > On Fri, Jul 17, 2015 at 11:14:14PM +0100, Benjamin Herrenschmidt wrote: > > > On Fri, 2015-07-17 at 10:32 +0100, Will Deacon wrote: > > > > static inline void arch_spin_unlock(arch_spinlock_t *lock) > > > > { > > > > - SYNC_IO; > > > > - __asm__ __volatile__("# arch_spin_unlock\n\t" > > > > - PPC_RELEASE_BARRIER: : :"memory"); > > > > + smp_mb(); > > > > lock->slock = 0; > > > > } > > > > > > That probably needs to be mb() in case somebody has the expectation that > > > it does a barrier vs. DMA on UP. > > > > Hmm, but on !SMP doesn't arch_spin_unlock effectively expand to barrier() > > in the core code? > > Yes, to barrier(), but that doesn't generate any code. In contrast, the > mb() that Ben is asking for puts out a sync instruction. Without that > sync instruction, MMIO accesses can be reordered with the spin_unlock(), > even on single-CPU systems. So the bm() is really needed if unlock is > to order against MMIO (and thus DMA) on UP. Understood, but my point was that this needs to be done in the core code rather than here. Perhaps it's easier to leave mmiowb() alone for PowerPC for now and instead predicate that on !SMP? Will -- To unsubscribe from this list: send the line "unsubscribe linux-arch" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html