There are two problems with the current implementation:

1. The memset on isochronous transfers, which empties the buffers in
   order to avoid leaking raw memory to userspace. This costs a lot on
   Intel Atoms and is also noticeable on other systems.
2. Memory fragmentation. Recent systems seem to perform better here,
   since we have not received that report for several months now, or
   maybe user behavior has changed. Some older Linux systems (maybe
   2-3 years old) triggered this issue far more often.

The CPU usage decreases 1-2% on my 1.3 GHz U7300 notebook.
The CPU usage decreases 6-8% on an Intel Atom N270 when transferring
20 MB/s (isochronous). It would be more interesting to see those
statistics on embedded systems (e.g. some older MIPS systems), where
copying data is more expensive.

I would not count on an IOMMU in that case, because several systems
that would benefit from a change in that area simply do not support
one.

You can search support.sundtek.com for reports about allocation issues;
you'll find quite a few.

Best Regards,
Markus

On Tue, Nov 17, 2015 at 6:02 PM, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
> On Tue, 17 Nov 2015, Christoph Hellwig wrote:
>
>> On Mon, Nov 16, 2015 at 03:22:06PM -0500, Alan Stern wrote:
>> > In other words, you're suggesting we do this:
>> >
>> > Check that userspace requested zerocopy (otherwise the user
>> > program might try to access other data stored in the same cache
>> > lines as the buffer while the I/O is in progress);
>> >
>> > Call get_user_pages (or get_user_pages_fast? -- it's not clear
>> > which should be used) for this buffer;
>> >
>> > Use the array of pages returned by that routine to populate
>> > a scatter-gather list (sg_alloc_table_from_pages);
>> >
>> > Pass that list to dma_map_sg.
>> >
>> > Is that right?
>>
>> Yes.
>>
>> > Does dma_map_sg check the page addresses against the DMA mask and
>> > automatically create a bounce buffer, or do we have to do that
>> > manually?
>> > Documentation/DMA-API-HOWTO.txt doesn't discuss this.
>>
>> You need to do this manually.
>
> I looked through the code. Christoph was wrong about this, at least on
> systems that support CONFIG_SWIOTLB. Of course, using a bounce buffer
> kind of defeats the purpose of zerocopy I/O, but I guess sometimes
> there's no choice.
>
> AFAICT this leaves two questions. First, should we worry about systems
> that don't support SWIOTLB? My feeling is probably not. In fact, the
> existing DMA mapping code used for ordinary USB communications doesn't
> try to handle mapping errors by setting up bounce buffers; it assumes
> that dma_map_sg() takes care of all that.
>
> Second, how shall we ask user programs to indicate that they won't
> access the buffer's memory pages during I/O? My suggestion is that we
> add a USBDEVFS_URB_ZEROCOPY flag (and a corresponding capability bit).
> Any objections?
>
> Alan Stern
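
To make the bounce-buffer question in the quoted text concrete, here is a
minimal userspace sketch of the kind of address check involved: a segment
can be handed to the device directly only if its last byte is still
addressable under the device's DMA mask, and otherwise it has to be bounced.
This is only an illustration, not kernel code. The names dma_bit_mask,
needs_bounce, count_bounced and struct segment are made up for this example,
and it assumes plain 64-bit bus addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one scatter-gather entry: bus address + length. */
struct segment {
    uint64_t addr;
    size_t len;
};

/* Hypothetical helper: mask with the low `bits` bits set (32 -> 0xffffffff). */
static uint64_t dma_bit_mask(unsigned int bits)
{
    assert(bits >= 1 && bits <= 64);
    return bits == 64 ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);
}

/*
 * Hypothetical helper: true if [addr, addr + len) is not fully addressable
 * under `mask`, i.e. the segment would have to go through a bounce buffer.
 */
static bool needs_bounce(uint64_t addr, size_t len, uint64_t mask)
{
    assert(len > 0);
    uint64_t last = addr + (uint64_t)len - 1;

    if (last < addr)        /* address arithmetic wrapped around */
        return true;
    return last > mask;
}

/* Count the segments of a list that could not be mapped directly. */
static size_t count_bounced(const struct segment *segs, size_t n, uint64_t mask)
{
    size_t bounced = 0;

    for (size_t i = 0; i < n; i++)
        if (needs_bounce(segs[i].addr, segs[i].len, mask))
            bounced++;
    return bounced;
}
```

With a 32-bit mask, a 4096-byte segment at 0xfffff000 still fits, while one
starting a single byte higher crosses the 4 GiB boundary and would be
bounced. Every bounced segment means an extra copy, which is why a bounce
buffer works against the purpose of zerocopy I/O.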