On Mon, 1 Aug 2011, Sebastian Andrzej Siewior wrote:

> * Laurent Pinchart | 2011-08-01 13:13:15 [+0200]:
>
> >Hi Sebastian,
> Hi Laurent,
>
> >> What about using kmalloc() + dma_map_single() + dma_unmap_single()
> >> instead of arch dependent code in drivers? Those are no-ops on x86 and
> >> perform the required syncs on other architectures.
> >
> >Do you mean creating and tearing down the mapping for each transform? Isn't
> >that very costly on many non-x86 platforms?
>
> You need consistent memory between hardware and cpu and this leaves you
> with two options:
> - use uncached memory
> - use cached memory and flush it
>
> If the architecture is cache coherent then you don't have to worry, the
> memory controller will snoop the cache line and invalidate the cache
> line or update the memory. If this is not the case then you have to do
> this on your own.
> Using uncached memory has the benefit that reads/writes are performed
> directly to main memory but since you bypass the cache, every access
> goes across the bus. So uncached memory is probably a good choice if you
> need to access the memory just once or twice.

That's exactly the problem.  In uvcvideo, the memory _is_ accessed just
once.  But even that once is slow enough to cause problems, according
to the original complaint.

> The dma ops usually flush the cache operations. So once the cache line
> for this memory address is clean then the CPU can fill its cache by
> fetching a bigger memory block instead of 4 bytes only (or 1 if the
> memcpy works on bytes).
>
> Is it possible to align the buffers somehow differently and avoid the
> memcpy in the first place?

IIUC, there's no way to avoid copying the data since it has to be
processed before getting sent to userspace.

Alan Stern