Hi Sebastian and Laurent,

On Mon, Aug 1, 2011 at 7:47 AM, Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> wrote:
> * Laurent Pinchart | 2011-08-01 13:13:15 [+0200]:
>
>>Hi Sebastian,
> Hi Laurent,
>
>>> What about using kmalloc() + dma_map_single() + dma_unmap_single()
>>> instead of arch dependend code in drivers? Those are nops x86 and
>>> perform the required syncs on other architectures.
>>
>>Do you mean creating and tearing down the mapping for each transform ? Isn't
>>that very costly on many non-x86 platforms ?
>
> You need consistent memory between hardware and cpu and this leaves you
> with two options:
> - use uncached memory
> - use cached memory and flush it
>
> If the architecture is cache coherent then you don't have to worry, the
> memory controller will snoop the cache line and invalidate the cache
> line or update the memory. If this is not the case then you have to do
> this on your own.
> Using uncached memory has the benefit that reads/writes are performed
> directly to main memory but since you bypass the cache, every access
> goes across the bus. So uncached memory is probably a good if you need
> to access the memory just once or twice.
> The dma ops usually flush the cache operations. So once the cache line
> for this memory address is clean then the CPU can fill its cache by
> fetching a bigger memory block instead of 4 bytes only (or 1 if the
> memcpy works on bytes).
>
> Is it possible to align the buffers somehow differrently and avoid the
> memcpy in the first place?
>
> Sebastian
>

Thanks very much for the feedback. I don't think it's possible to remove
the need for the memcpy because the ISOC USB transfer data in the buffer
has holes and must be coalesced back into a contiguous stream.

If I understand correctly, there are two primary ways for a driver like
UVC to get buffers that will be used to hold USB transfer data.

1. Use kmalloc() and pass the buffer address in the URB.
   In this case the USB driver will handle all DMA mapping and cache
   coherency issues on each transfer.

2. Use usb_alloc_coherent(), pass the address in the URB and set
   URB_NO_TRANSFER_DMA_MAP in the URB's "transfer_flags". This maps the
   buffer once when it is allocated, and the system does not do any
   per-transfer mapping or cache management.

Currently I see that sound, video, HID and network USB drivers all use
usb_alloc_coherent(). To solve the performance issues on systems without
cache-coherent DMA, it looks like I'll have to change all these drivers
(with the possible exception of HID) to use kmalloc().

It seems the choice between the two methods comes down to
platform-specific performance considerations. Ideally this kind of
choice would be made once in the USB driver instead of by every driver
that sits on top of the USB driver.

Thanks
Al
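
P.S. To illustrate why the memcpy cannot simply be dropped, here is a
rough userspace sketch of the coalescing step. It is not the uvcvideo
code: struct iso_packet and coalesce_iso() are hypothetical names made
up for this example, loosely modelled on the offset and actual_length
fields of the kernel's per-packet isochronous descriptors. It assumes
each packet owns a fixed slot in the transfer buffer and may fill only
part of it, which is where the holes come from. Per-packet status
checks are left out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the per-packet bookkeeping of an
 * isochronous transfer (not a kernel type). */
struct iso_packet {
    size_t offset;        /* start of this packet's slot in the transfer buffer */
    size_t actual_length; /* bytes the device really delivered into the slot */
};

/*
 * Copy the valid bytes of every packet into dst back to back, skipping
 * the unused tail of each slot. Returns the number of bytes written.
 * Stops at the first packet that would not fit into dst.
 */
size_t coalesce_iso(unsigned char *dst, size_t dst_size,
                    const unsigned char *xfer_buf,
                    const struct iso_packet *pkts, size_t npkts)
{
    size_t out = 0;

    for (size_t i = 0; i < npkts; i++) {
        size_t len = pkts[i].actual_length;

        if (len > dst_size - out)
            break; /* destination full, drop the rest */

        memcpy(dst + out, xfer_buf + pkts[i].offset, len);
        out += len;
    }
    return out;
}
```

Every byte of payload is read from the transfer buffer exactly once in
this loop, so on a system without cache-coherent DMA the cost of that
read depends directly on whether the buffer is mapped cached or
uncached.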