On Mon, Nov 02, 2015 at 07:32:19PM +0200, Shamir Rabinovitch wrote:
> Correct. This issue is one of the concerns here in the previous replies.
> I will take different approach which will not require the IOMMU bypass
> per mapping. Will try to shift to the x86 'iommu=pt' approach.

Yeah, it doesn't really make sense to have an extra remappable area when
the device can access all physical memory anyway.

> We had a bunch of issues around SPARC IOMMU. Not all of them relate to
> performance. The first issue was that on SPARC, currently, we only have
> limited address space to IOMMU so we had issue to do large DMA mappings
> for Infiniband. Second issue was that we identified high contention on
> the IOMMU locks even in ETH driver.

Contended IOMMU locks are not only a problem on SPARC, but on x86 and
various other IOMMU drivers too. But I have some ideas on how to improve
the situation there.

> I do not want to put too much information here but you can see some results:
>
> rds-stress test from sparc t5-2 -> x86:
>
> with iommu bypass:
> ---------------------
> sparc->x86 cmdline = -r XXX -s XXX -q 256 -a 8192 -T 10 -d 10 -t 3 -o XXX
> tsks   tx/s rx/s  tx+rx K/s mbi K/s mbo K/s tx us/c rtt us cpu %
>    3 141278    0 1165565.81    0.00    0.00    8.93 376.60 -1.00 (average)
>
> without iommu bypass:
> ---------------------
> sparc->x86 cmdline = -r XXX -s XXX -q 256 -a 8192 -T 10 -d 10 -t 3 -o XXX
> tsks   tx/s rx/s  tx+rx K/s mbi K/s mbo K/s tx us/c rtt us cpu %
>    3  78558    0  648101.41    0.00    0.00   15.05 876.72 -1.00 (average)
>
> + RDMA tests are totally not working (might be due to failure to DMA map all the memory).
>
> So IOMMU bypass give ~80% performance boost.

Interesting. Have you looked more closely at what causes the performance
degradation? Is it the lock contention or something else?


	Joerg