Ralf Baechle wrote:
> On Wed, Mar 03, 2010 at 12:03:45PM +0000, Catalin Marinas wrote:
>
> > > >   /* We need for force the visibility of tp->intr_mask
> > > >    * for other CPUs, as we can loose an MSI interrupt
> > > >    * and potentially wait for a retransmit timeout if we don't.
> > > >    * The posted write to IntrMask is safe, as it will
> > > >    * eventually make it to the chip and we won't loose anything
> > > >    * until it does.
> > > >    */
> > > >   tp->intr_mask = 0xffff;
> > > >   smp_wmb();
> > > >   RTL_W16(IntrMask, tp->intr_event);
> > > >
> > > > Is this supposed to work given the SMP barriers semantics?
> > >
> > > Well, if the smp_wmb() is supposed to make the assignment to
> > > tp->intr_mask globally visible before any effects of the RTL_W16(),
> > > then it's buggy. But from the comments it appears that the smp_wmb()
> > > might be intended to order the store to tp->intr_mask with respect to
> > > following cacheable stores, rather than with respect to the RTL_W16(),
> > > which would be OK. I can't say without having a much closer look at
> > > what that driver is actually doing.
> >
> > I cc'ed the r8169.c maintainer.
> >
> > But from the architectural support perspective, we don't need to support
> > more than a lightweight barrier in this case.

If the ordering relative to RTL_W16 doesn't matter, imho it would be
much clearer to move the RTL_W16() somewhere else, such as before the
comment. The comment is quite misleading.

> Be afraid, very afraid when you find a non-SMP memory barrier in the
> kernel. A while ago I reviewed a number of uses throughout the kernel and
> each one of them was somehow buggy - either entirely unnecessary or should
> be replaced with an SMP memory barrier or was simple miss-placed.

It's not hard to think of cases where a non-SMP barrier is necessary
outside of the kernel code API (dma_* etc.), but it would be quite
interesting if those cases never occur with real devices, or if they
can always be transformed to something else.
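To make the "move the RTL_W16() before the comment" suggestion concrete, here is a rough user-space sketch. It is not the r8169 code: struct fake_tp, sketch_smp_wmb(), sketch_rtl_w16() and reenable_interrupts() are hypothetical stand-ins, the C11 release fence is only an approximation of smp_wmb(), and the whole thing assumes the ordering of the store against the register write really doesn't matter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's private state. */
struct fake_tp {
	uint16_t intr_mask;               /* cacheable state read by other CPUs */
	uint16_t intr_event;              /* value to program into the chip */
	volatile uint16_t mmio_intr_mask; /* pretends to be the IntrMask register */
};

/* Approximation only: a C11 release fence orders the earlier store
 * against later stores, which is roughly what smp_wmb() is for. */
static inline void sketch_smp_wmb(void)
{
	atomic_thread_fence(memory_order_release);
}

/* Stand-in for RTL_W16(): a plain volatile store, not a real MMIO write. */
static inline void sketch_rtl_w16(struct fake_tp *tp, uint16_t val)
{
	tp->mmio_intr_mask = val;
}

static void reenable_interrupts(struct fake_tp *tp)
{
	assert(tp);

	/* Register write moved first, so nothing suggests that the
	 * barrier below orders it against the intr_mask store. */
	sketch_rtl_w16(tp, tp->intr_event);

	/* Make the new intr_mask visible to other CPUs before any
	 * later cacheable stores. */
	tp->intr_mask = 0xffff;
	sketch_smp_wmb();
}
```

Written that way, the barrier sits right next to the only store it is meant to order, and the comment no longer reads as if it were about the register write.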
The RTL_W16 sample is quite interesting - even if it does not apply to that particular driver, it's suggestive of a pattern where some device may need to delay triggering an interrupt (on a different CPU) until a data structure is written, or that some device may require ordered MMIO reads and writes. In the case of interrupts, code like that can probably be transformed.

-- Jamie