On Mon, 29 Aug 2011, Russell King - ARM Linux wrote:

> > You know better than I do what is needed to resolve the ordering issue.
> > However, contrary to what the original patch description said, this
> > isn't entirely a matter of making the write visible to the host
> > controller: No doubt in time the write will eventually become visible
> > anyway. It's a matter of making the write become visible reasonably
> > quickly and in the correct order with respect to other writes.
>
> I'm not entirely sure what the problem is - I think it's about a write
> by the CPU to dma coherent memory being delayed and not being visible
> to the HC in a timely manner. Either mb() or wmb() placed after the
> write on ARM will do that - and ARM has no requirement to do a
> read-back after the barrier.

Okay, then this needs to be done in a way that won't slow down other
architectures with an unnecessary memory barrier. And there needs to
be a comment in the code explaining that the new mb() instruction isn't
being used as a memory barrier but rather to expedite writeback of the
L2 cache.

This certainly is starting to sound like something that needs to be
addressed in the arch-specific #include files...

> > Is this extra L2-cache "poke" needed for proper ordering, or is it
> > needed merely to flush the write out to memory in a timely manner?
>
> Both, though primarily it's about ensuring correct ordering. A side
> effect of it is that it will flush all pending writes in L2 before
> completing.
>
> From the theoretical viewpoint, I think I'm right to say that mb()
> doesn't need to provide that level of ordering as it's supposed to be
> an inter-CPU barrier - which probably means we need to invent a new
> barrier to deal with DMA memory ordering. However, given the
> difficulty of getting the existing barriers placed correctly, I don't
> think inventing new barriers is a very good idea.
>
> What we can do is view devices which perform DMA as being strongly
> ordered with respect to their memory accesses - iow, they have an
> implicit memory barrier before and after their accesses to memory.
> This would make the CPU's use of mb() have a conceptual pairing with
> the DMA agents.

Yes, that's the model I have been using all along. After all, if a DMA
master carries out its memory accesses in some random order then it's
impossible for the CPU to make any guarantees.

Alan Stern
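
For illustration only, here is a rough sketch of what an arch-specific
hook along these lines might look like. It is written as user-space C so
it can be compiled standalone. The names hc_sync_mem(), struct hc_desc,
HC_DESC_ACTIVE and hc_desc_activate() are made up for the example, and
mb()/wmb() here are stand-ins built on the GCC/Clang
__sync_synchronize() builtin, not the kernel's definitions.

```c
#include <stdint.h>

/* Stand-ins for the kernel barriers (assumption: GCC/Clang builtin). */
#define mb()  __sync_synchronize()
#define wmb() __sync_synchronize()

/*
 * Hypothetical arch hook, not an existing kernel macro.  It is NOT
 * used as a CPU memory barrier here: on ARM it is meant to expedite
 * getting the preceding write out to memory where the host controller
 * can see it.  Other architectures pay nothing for it.
 */
#if defined(__arm__)
#define hc_sync_mem() mb()
#else
#define hc_sync_mem() do { } while (0)
#endif

/* Hypothetical descriptor living in DMA-coherent memory. */
struct hc_desc {
	volatile uint32_t buf_addr;
	volatile uint32_t token;
};

#define HC_DESC_ACTIVE 0x80u

/* Fill in the descriptor, then hand it over to the DMA master. */
static inline void hc_desc_activate(struct hc_desc *d, uint32_t addr)
{
	d->buf_addr = addr;
	wmb();			/* ordering: payload before the active flag */
	d->token |= HC_DESC_ACTIVE;
	hc_sync_mem();		/* timeliness: see comment on the macro */
}
```

The split is the point of the sketch: wmb() stays where ordering is
needed on every architecture, while the extra step is confined to a
hook that only the affected architecture defines to do anything.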