On Tue, 2014-07-15 at 11:38 +0200, Arnd Bergmann wrote:
> More importantly: you do the same operation for both _for_cpu and _for_device.
> I assume your CPU can never do speculative cache prefetches, so it's not
> incorrect, but you do twice the number of invalidations and flushes that
> you need.

That's not necessarily a correct assumption. A lot of CPUs (x86, arm, parisc) feel entitled to speculate provided they have a TLB entry. Usually they don't just do it on a whim, so the CPU has to be doing something to cause the speculation, like reading from an adjacent page. However, for DMA you always have to assume the possibility (unless you really, really know the architecture cannot).

Therefore the pattern should be (assuming your bus doesn't need some kind of flush and all flushes are only for the CPU):

DMA_TO_DEVICE: flush before (_for_device), do nothing after (_for_cpu)
DMA_FROM_DEVICE: do nothing before (_for_device), invalidate after (_for_cpu)
DMA_BIDIRECTIONAL: flush before (_for_device) and invalidate after (_for_cpu)

James
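
For illustration only, the direction table above could be sketched roughly as below. This is not code from the patch under review: cache_flush_range(), cache_invalidate_range(), sync_for_device(), sync_for_cpu() and the enum are hypothetical names, and the two cache helpers only bump counters so the sketch builds outside the kernel. It also assumes, as above, that the bus needs no flush of its own.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's enum dma_data_direction. */
enum dma_dir { DMA_BIDIRECTIONAL, DMA_TO_DEVICE, DMA_FROM_DEVICE };

/* Counters replace real cache maintenance so the sketch runs anywhere. */
unsigned int nr_flushes, nr_invalidates;

/* Placeholder: write back dirty lines covering [addr, addr + size). */
void cache_flush_range(void *addr, size_t size)
{
	(void)addr;
	(void)size;
	nr_flushes++;
}

/* Placeholder: discard cached lines covering [addr, addr + size). */
void cache_invalidate_range(void *addr, size_t size)
{
	(void)addr;
	(void)size;
	nr_invalidates++;
}

/* Before the device owns the buffer: flush only if the device may read it. */
void sync_for_device(void *addr, size_t size, enum dma_dir dir)
{
	switch (dir) {
	case DMA_TO_DEVICE:
	case DMA_BIDIRECTIONAL:
		cache_flush_range(addr, size);
		break;
	case DMA_FROM_DEVICE:
		/* nothing to do before the transfer */
		break;
	}
}

/*
 * After the device is done: invalidate only if the device may have written.
 * Doing it here rather than before the transfer also discards any lines the
 * CPU pulled in speculatively while the DMA was in flight.
 */
void sync_for_cpu(void *addr, size_t size, enum dma_dir dir)
{
	switch (dir) {
	case DMA_FROM_DEVICE:
	case DMA_BIDIRECTIONAL:
		cache_invalidate_range(addr, size);
		break;
	case DMA_TO_DEVICE:
		/* nothing to do after the transfer */
		break;
	}
}
```

With this split, each direction costs at most one flush and one invalidate per transfer, instead of doing both operations in both hooks.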