On Mon, Aug 25, 2008 at 08:46:01PM +0200, Thomas Bogendoerfer wrote:

> On Mon, Aug 25, 2008 at 09:34:29AM -0700, David Daney wrote:
> > What is the reasoning for only doing the cache operation on R10K based
> > systems?
>
> non coherent R10k need after DMA operations to get rid of remains
> of load/store speculations. Other CPUs don't pollute the cache
> after it got flushed.
>
> But this optimization is wrong, we need to do the flush for
> every non coherent device otherwise polling a descriptor via
> a cached mapping can't work. And this exactly what E100 does.

When polling, the buffer basically changes ownership between CPU and
device all the time, so a driver needs to do a dma_sync_*_for_cpu call
before looking at the buffer, then dma_sync_*_for_device to return the
buffer to the device.  So the polling loop will work fine as long as one
of the two calls does the flush operation.  In fact we're even doing
double flushes for the case of non-coherent R10000s ...

  Ralf
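
The ownership hand-off described above can be sketched as a small userspace toy model. This is not kernel code: every name in it (toy_desc, toy_sync_for_cpu, toy_sync_for_device, toy_device_step, toy_poll) is hypothetical. The two toy_sync_* helpers only stand in for the dma_sync_*_for_cpu and dma_sync_*_for_device calls. The model also assumes that the for_cpu side is the one that performs the cache operation, which is just one of the two possibilities mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a non-coherent system: the device writes to memory,
 * while the CPU reads through a cache that is not updated on its own. */
struct toy_desc {
    int mem_status;     /* what the device has written to RAM */
    int cached_status;  /* what the CPU sees through its cache */
    int flushes;        /* number of cache operations performed */
};

/* Hypothetical stand-in for dma_sync_*_for_cpu: refresh the cached copy
 * so that the next CPU read sees what is in memory. */
static void toy_sync_for_cpu(struct toy_desc *d)
{
    d->cached_status = d->mem_status;
    d->flushes++;
}

/* Hypothetical stand-in for dma_sync_*_for_device: returns the buffer to
 * the device. In this toy model it performs no cache operation. */
static void toy_sync_for_device(struct toy_desc *d)
{
    (void)d;
}

/* The simulated device completes the descriptor on iteration done_at. */
static void toy_device_step(struct toy_desc *d, int iter, int done_at)
{
    if (iter == done_at)
        d->mem_status = 1;
}

/* Poll the descriptor for up to max_iters iterations. Returns the
 * iteration on which the CPU observed completion, or -1 if it never did.
 * With do_sync == false the CPU keeps reading its stale cached copy. */
static int toy_poll(struct toy_desc *d, int done_at, int max_iters,
                    bool do_sync)
{
    assert(d != NULL);

    for (int i = 0; i < max_iters; i++) {
        toy_device_step(d, i, done_at);

        if (do_sync)
            toy_sync_for_cpu(d);      /* CPU takes ownership */

        if (d->cached_status == 1)    /* look at the buffer */
            return i;

        if (do_sync)
            toy_sync_for_device(d);   /* give it back to the device */
    }
    return -1;
}
```

With do_sync set, the loop sees the completion on the same iteration the simulated device writes it. Without the sync calls the cached copy never changes and the loop runs to max_iters, which is the failure mode of polling a descriptor through a cached mapping with no flush.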