On 2023-02-04 06:32, Baolu Lu wrote:
> On 2023/2/4 7:04, Jacob Pan wrote:
>> Intel IOMMU driver implements IOTLB flush queue with domain selective
>> or PASID selective invalidations. In this case there's no need to track
>> the IOVA page range and sync IOTLBs, which may cause a significant
>> performance hit.
>
> [Add cc Robin]
>
> If I understand this patch correctly, this might be caused by the
> helper below:
> /**
>  * iommu_iotlb_gather_add_page - Gather for page-based TLB invalidation
>  * @domain: IOMMU domain to be invalidated
>  * @gather: TLB gather data
>  * @iova: start of page to invalidate
>  * @size: size of page to invalidate
>  *
>  * Helper for IOMMU drivers to build invalidation commands based on individual
>  * pages, or with page size/table level hints which cannot be gathered if they
>  * differ.
>  */
> static inline void iommu_iotlb_gather_add_page(struct iommu_domain *domain,
> 					       struct iommu_iotlb_gather *gather,
> 					       unsigned long iova, size_t size)
> {
> 	/*
> 	 * If the new page is disjoint from the current range or is mapped at
> 	 * a different granularity, then sync the TLB so that the gather
> 	 * structure can be rewritten.
> 	 */
> 	if ((gather->pgsize && gather->pgsize != size) ||
> 	    iommu_iotlb_gather_is_disjoint(gather, iova, size))
> 		iommu_iotlb_sync(domain, gather);
>
> 	gather->pgsize = size;
> 	iommu_iotlb_gather_add_range(gather, iova, size);
> }
>
> As the comment for iommu_iotlb_gather_is_disjoint() says,
>
> "...For many IOMMUs, flushing the IOMMU in this case is better
> than merging the two, which might lead to unnecessary invalidations.
> ..."
>
> So perhaps the right fix for this performance issue is to add an
>
> 	if (!gather->queued)
>
> check in iommu_iotlb_gather_add_page() or
> iommu_iotlb_gather_is_disjoint()? It should benefit other archs as
> well.
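
(Concretely, I read that suggestion as something like the sketch below -
untested, just to make the alternative explicit. It assumes the
gather->queued flag, which the caller sets when unmaps are covered by a
flush queue:)

/*
 * Untested sketch of the suggested core-code change: bail out of the
 * per-page gather accounting when invalidation is deferred to a flush
 * queue, since the queued flush covers the whole range anyway.
 */
static inline void iommu_iotlb_gather_add_page(struct iommu_domain *domain,
					       struct iommu_iotlb_gather *gather,
					       unsigned long iova, size_t size)
{
	/* Deferred (lazy) invalidation: no need to track this page at all */
	if (gather->queued)
		return;

	if ((gather->pgsize && gather->pgsize != size) ||
	    iommu_iotlb_gather_is_disjoint(gather, iova, size))
		iommu_iotlb_sync(domain, gather);

	gather->pgsize = size;
	iommu_iotlb_gather_add_range(gather, iova, size);
}
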
The iotlb_gather helpers are really just that - little tools to help
drivers with various common iotlb_gather accounting patterns. The
decision whether to bother with that accounting at all should really
come beforehand, and whether a driver supports flush queues is
orthogonal to whether it uses any particular gather helper(s) or not, so
I think the patch as-is is correct.
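
To illustrate the shape of that (a rough sketch of the driver-side check,
assuming the vt-d ->unmap() path; not necessarily the exact patch):

/*
 * Sketch: the driver decides up front, in its ->unmap() callback,
 * whether the gather accounting is needed at all. When the caller uses
 * a flush queue (gather->queued), invalidation is deferred, so tracking
 * the IOVA range and syncing here would be pure overhead.
 */
static size_t intel_iommu_unmap(struct iommu_domain *domain,
				unsigned long iova, size_t size,
				struct iommu_iotlb_gather *gather)
{
	/* ... page-table unmapping elided ... */

	if (!gather->queued)
		iommu_iotlb_gather_add_page(domain, gather, iova, size);

	return size;
}
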
>> This patch adds a check to avoid gathering IOVA pages and syncing the
>> IOTLB on the lazy path.
>>
>> The performance on a Sapphire Rapids 100Gb NIC improves as follows
>> (as measured by iperf send):
>
> Which test case have you run? Post the real data if you have any.
>
>> w/o this fix: ~48 Gbits/s; with this fix: ~54 Gbits/s
>>
>> Cc: <stable@xxxxxxxxxxxxxxx>
>
> Again, add a Fixes tag so that people know how far this fix should be
> backported.

Note that the overall issue probably dates back to the initial iommu-dma
conversion, but if you think it's important enough to go back beyond
5.15 when gather->queued was introduced, that'll need a different fix.
Cheers,
Robin.