On 04/12/2018 07:56 AM, Christoph Hellwig wrote:
> On Thu, Apr 12, 2018 at 04:51:23PM +0200, Christoph Hellwig wrote:
>> On Thu, Apr 12, 2018 at 03:50:29PM +0200, Jesper Dangaard Brouer wrote:
>>> ---------------
>>> Implement support for keeping the DMA mapping through the XDP return
>>> call, to remove RX map/unmap calls.  Implement bulking for XDP
>>> ndo_xdp_xmit and XDP return frame API.  Bulking allows to perform DMA
>>> bulking via scatter-gather DMA calls, XDP TX need it for DMA
>>> map+unmap. The driver RX DMA-sync (to CPU) per packet calls are harder
>>> to mitigate (via bulk technique). Ask DMA maintainer for a common
>>> case direct call for swiotlb DMA sync call ;-)
>>
>> Why do you even end up in swiotlb code?  Once you bounce buffer your
>> performance is toast anyway..
>
> I guess that is because x86 selects it as the default as soon as
> we have more than 4G memory. That should be solvable fairly easily
> with the per-device dma ops, though.

I guess there is nothing we need to do!

On x86, when there is no Intel IOMMU or the IOMMU is disabled, you end up
in swiotlb for DMA API calls when the system has more than 4G of memory.
However, AFAICT, for 64-bit DMA-capable devices the swiotlb DMA APIs do not
use the bounce buffer unless you have swiotlb=force specified on the kernel
command line.

e.g. here is the snippet:

dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
                            unsigned long offset, size_t size,
                            enum dma_data_direction dir,
                            unsigned long attrs)
{
        phys_addr_t map, phys = page_to_phys(page) + offset;
        dma_addr_t dev_addr = phys_to_dma(dev, phys);

        BUG_ON(dir == DMA_NONE);
        /*
         * If the address happens to be in the device's DMA window,
         * we can safely return the device addr and not worry about bounce
         * buffering it.
         */
        if (dma_capable(dev, dev_addr, size) && swiotlb_force != SWIOTLB_FORCE)
                return dev_addr;

-Tushar
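
To make the fast-path condition above concrete, here is a rough userspace model of it. This is only a sketch, not kernel code: struct device_model, model_dma_capable(), model_needs_bounce() and the MODEL_* constants are hypothetical stand-ins, and the mask comparison assumes dma_capable() boils down to "the last byte of the buffer is at or below the device's DMA mask".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel definitions. */
typedef uint64_t model_dma_addr_t;

enum model_swiotlb_force {
	MODEL_SWIOTLB_NORMAL,	/* default behaviour */
	MODEL_SWIOTLB_FORCE	/* models swiotlb=force on the command line */
};

struct device_model {
	uint64_t dma_mask;	/* highest bus address the device can reach */
};

/*
 * Assumed shape of the capability check: the whole buffer must sit at or
 * below the device's DMA mask. Assumes size > 0 and no address wraparound.
 */
static bool model_dma_capable(const struct device_model *dev,
			      model_dma_addr_t addr, size_t size)
{
	if (!dev->dma_mask)
		return false;
	return addr + size - 1 <= dev->dma_mask;
}

/*
 * Mirrors the condition in the snippet: the mapping is returned directly
 * only if the device can address the buffer AND bouncing is not forced.
 * Anything else falls through to the bounce-buffer path.
 */
static bool model_needs_bounce(const struct device_model *dev,
			       model_dma_addr_t addr, size_t size,
			       enum model_swiotlb_force force)
{
	return !(model_dma_capable(dev, addr, size) &&
		 force != MODEL_SWIOTLB_FORCE);
}
```

With a full 64-bit mask, a buffer at 8G reports no bounce under MODEL_SWIOTLB_NORMAL, while the same buffer with a 32-bit mask, or with MODEL_SWIOTLB_FORCE, reports that a bounce would be needed.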