On Fri, Apr 13, 2018 at 10:12:41AM -0700, Tushar Dave wrote:
> I guess there is nothing we need to do!
>
> On x86, in case of no intel iommu or iommu is disabled, you end up in
> swiotlb for DMA API calls when system has 4G memory.
> However, AFAICT, for 64bit DMA capable devices swiotlb DMA APIs do not
> use bounce buffer until and unless you have swiotlb=force specified in
> kernel commandline.

Sure.  But that means every sync_*_to_device and sync_*_to_cpu now
involves an indirect call to do exactly nothing, which in the workload
Jesper is looking at is causing a huge performance degradation due to
retpolines.
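
To make the cost concrete, here is a minimal userspace sketch of the
pattern being described: a sync hook that does nothing but is still
reached through a function pointer, next to a variant that skips the
indirect call when no sync is needed.  Every name in it (demo_dma_ops,
demo_sync_nop, demo_sync_always, demo_sync_fast, the needs_sync flag) is
hypothetical and made up for illustration; this is not the kernel's
dma_map_ops or the swiotlb code, and the counter exists only so the
difference is observable.

```c
#include <stddef.h>

/* Hypothetical stand-in for an ops table; NOT the kernel's struct dma_map_ops. */
struct demo_dma_ops {
	void (*sync_for_device)(void *buf, size_t len);
};

/* Counts how often the hook was entered; for illustration only. */
static unsigned long demo_indirect_calls;

/*
 * Models a sync hook with no real work to do (no bounce buffer in use).
 * Apart from bumping the demo counter it does nothing.
 */
static void demo_sync_nop(void *buf, size_t len)
{
	(void)buf;
	(void)len;
	demo_indirect_calls++;
}

static const struct demo_dma_ops demo_nop_ops = {
	.sync_for_device = demo_sync_nop,
};

/*
 * Always dispatches through the function pointer.  With retpolines
 * enabled, each such indirect call is costly even when the target
 * does nothing.
 */
static void demo_sync_always(const struct demo_dma_ops *ops,
			     void *buf, size_t len)
{
	ops->sync_for_device(buf, len);
}

/*
 * Assumed alternative: the caller already knows whether a sync is
 * needed, so the indirect call is skipped entirely when it is not.
 */
static void demo_sync_fast(const struct demo_dma_ops *ops, int needs_sync,
			   void *buf, size_t len)
{
	if (needs_sync)
		ops->sync_for_device(buf, len);
}
```

With needs_sync == 0, demo_sync_fast() costs one predictable branch
instead of an indirect call per buffer.  How such a flag would actually
be derived in the kernel is left open here.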