On 9/25/23 5:08 PM, Joao Martins wrote:
On 25/09/2023 08:01, Baolu Lu wrote:
On 9/23/23 9:25 AM, Joao Martins wrote:
The IOMMU advertises Access/Dirty bits for the second-stage page table if the
extended capability DMAR register reports it (ECAP, mnemonic ECAP.SSADS).
The first-stage page table is compatible with the CPU page table, thus A/D
bits are implicitly supported. Relevant Intel IOMMU SDM refs: for the first
stage table "3.6.2 Accessed, Extended Accessed, and Dirty Flags" and for the
second stage table "3.7.2 Accessed and Dirty Flags".
The first-stage page table has A/D bits enabled by default, so dirty tracking
may always be set there: no control bits are needed and the enable path just
returns 0. To use SSADS, set bit 9 (SSADE) in the scalable-mode PASID table
entry and flush the IOTLB via pasid_flush_caches(), following the manual.
Relevant SDM refs:
"3.7.2 Accessed and Dirty Flags"
"6.5.3.3 Guidance to Software for Invalidations,
Table 23. Guidance to Software for Invalidations"
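Not the patch itself, but a rough sketch of what that enable path can look
like, assuming the existing pasid.c helpers intel_pasid_get_entry(),
pasid_pte_is_present(), pasid_set_bits(), pasid_get_domain_id() and
pasid_flush_caches(); the function name below is hypothetical:

static int intel_pasid_enable_ssade(struct intel_iommu *iommu,
                                    struct device *dev, u32 pasid)
{
        struct pasid_entry *pte;
        u16 did;

        spin_lock(&iommu->lock);
        pte = intel_pasid_get_entry(dev, pasid);
        if (!pte || !pasid_pte_is_present(pte)) {
                spin_unlock(&iommu->lock);
                return -ENODEV;
        }

        did = pasid_get_domain_id(pte);
        /* SSADE is bit 9 of the scalable-mode PASID table entry */
        pasid_set_bits(&pte->val[0], 1 << 9, 1 << 9);
        spin_unlock(&iommu->lock);

        /* Per Table 23, invalidate the PASID cache and IOTLB afterwards */
        pasid_flush_caches(iommu, pte, pasid, did);

        return 0;
}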
The PTE dirty bit is located in bit 9 and is cached in the IOTLB, so the
IOTLB must be flushed to make sure the IOMMU attempts to set the dirty bit
again. Note that iommu_dirty_bitmap_record() adds the IOVA to iotlb_gather,
so the caller of the iommu op is the one that flushes the IOTLB. The relevant
manual coverage of the hardware translation is chapter 6, with special
mention of:
"6.2.3.1 Scalable-Mode PASID-Table Entry Programming Considerations"
"6.2.4 IOTLB"
Signed-off-by: Joao Martins <joao.m.martins@xxxxxxxxxx>
---
The IOPTE walker is still a bit inefficient. Making sure the UAPI/IOMMUFD is
solid and agreed upon.
---
drivers/iommu/intel/iommu.c | 94 +++++++++++++++++++++++++++++++++++++
drivers/iommu/intel/iommu.h | 15 ++++++
drivers/iommu/intel/pasid.c | 94 +++++++++++++++++++++++++++++++++++++
drivers/iommu/intel/pasid.h | 4 ++
4 files changed, 207 insertions(+)
The code is probably incomplete. When attaching a domain to a device,
check the domain's dirty tracking capability against the device's
capabilities. If the domain's dirty tracking capability is set but the
device does not support it, the attach callback should return -EINVAL.
Yeap, I did that for AMD, but it seems that in the mix of changes I deleted
it and then forgot to include it here.
Here's what I added (together with consolidated cap checking):
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 7d5a8f5283a7..fabfe363f1f9 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4075,6 +4075,11 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
         return NULL;
 }
 
+static bool intel_iommu_slads_supported(struct intel_iommu *iommu)
+{
+        return sm_supported(iommu) && ecap_slads(iommu->ecap);
+}
+
 static struct iommu_domain *
 intel_iommu_domain_alloc_user(struct device *dev, u32 flags)
 {
@@ -4090,7 +4095,7 @@ intel_iommu_domain_alloc_user(struct device *dev, u32 flags)
                 return ERR_PTR(-EOPNOTSUPP);
 
         if (enforce_dirty &&
-            !device_iommu_capable(dev, IOMMU_CAP_DIRTY))
+            !intel_iommu_slads_supported(iommu))
                 return ERR_PTR(-EOPNOTSUPP);
 
         domain = iommu_domain_alloc(dev->bus);
@@ -4121,6 +4126,9 @@ static int prepare_domain_attach_device(struct iommu_domain *domain,
         if (dmar_domain->force_snooping && !ecap_sc_support(iommu->ecap))
                 return -EINVAL;
 
+        if (domain->dirty_ops && !intel_iommu_slads_supported(iommu))
+                return -EINVAL;
+
         /* check if this iommu agaw is sufficient for max mapped address */
         addr_width = agaw_to_width(iommu->agaw);
         if (addr_width > cap_mgaw(iommu->cap))
@@ -4376,8 +4384,7 @@ static bool intel_iommu_capable(struct device *dev, enum iommu_cap cap)
         case IOMMU_CAP_ENFORCE_CACHE_COHERENCY:
                 return ecap_sc_support(info->iommu->ecap);
         case IOMMU_CAP_DIRTY:
-                return sm_supported(info->iommu) &&
-                       ecap_slads(info->iommu->ecap);
+                return intel_iommu_slads_supported(info->iommu);
         default:
                 return false;
         }
Yes. The above additional change looks good to me.
Best regards,
baolu