On Wed, May 06, 2015 at 09:01:07AM +0200, Bart Van Assche wrote:
> On 05/06/15 00:38, Jason Gunthorpe wrote:
> >Heck, on modern systems 100% of these requirements can be solved just
> >by using the IOMMU. No need for the HCA at all. (HCA may be more
> >performant, of course)
>
> Hello Jason,
>
> Any performance tests I have run so far with the IOMMU enabled show
> much worse results than the same test with the IOMMU disabled. The
> perf tool learned me that this performance difference is due to lock
> contention caused by the IOMMU kernel code. I have not yet tried to
> verify whether this is an implementation issue or something
> fundamental.

I'm not surprised; I think that is well known.

Just to be clear, I'm not saying we should rely on the IOMMU, or even
implement anything that uses it - but as a thought exercise, the fact
that we could implement a page list API entirely with the dumbest HCA
and the IOMMU strongly suggests to me that it is a sane API direction
to look at.

If you did have a dumb HCA, using the IOMMU is probably a lot faster
than doing a heavy MR registration or doing operations 'page at a
time'. That would still be slower than using a smart HCA with the
IOMMU turned off, for that workload.

Jason