On 25/09/2023 08:01, Baolu Lu wrote:
> On 9/23/23 9:25 AM, Joao Martins wrote:
>> IOMMU advertises Access/Dirty bits for second-stage page table if the
>> extended capability DMAR register reports it (ECAP, mnemonic ECAP.SSADS).
>> The first stage table is compatible with CPU page table thus A/D bits are
>> implicitly supported. Relevant Intel IOMMU SDM ref for first stage table
>> "3.6.2 Accessed, Extended Accessed, and Dirty Flags" and second stage table
>> "3.7.2 Accessed and Dirty Flags".
>>
>> First stage page table is enabled by default so it's allowed to set dirty
>> tracking and no control bits needed, it just returns 0. To use SSADS, set
>> bit 9 (SSADE) in the scalable-mode PASID table entry and flush the IOTLB
>> via pasid_flush_caches() following the manual. Relevant SDM refs:
>>
>> "3.7.2 Accessed and Dirty Flags"
>> "6.5.3.3 Guidance to Software for Invalidations,
>>  Table 23. Guidance to Software for Invalidations"
>>
>> PTE dirty bit is located in bit 9 and it's cached in the IOTLB so flush
>> IOTLB to make sure IOMMU attempts to set the dirty bit again. Note that
>> iommu_dirty_bitmap_record() will add the IOVA to iotlb_gather and thus
>> the caller of the iommu op will flush the IOTLB. Relevant manuals over
>> the hardware translation is chapter 6 with some special mention to:
>>
>> "6.2.3.1 Scalable-Mode PASID-Table Entry Programming Considerations"
>> "6.2.4 IOTLB"
>>
>> Signed-off-by: Joao Martins <joao.m.martins@xxxxxxxxxx>
>> ---
>> The IOPTE walker is still a bit inneficient. Making sure the UAPI/IOMMUFD is
>> solid and agreed upon.
>> ---
>>  drivers/iommu/intel/iommu.c | 94 +++++++++++++++++++++++++++++++++++++
>>  drivers/iommu/intel/iommu.h | 15 ++++++
>>  drivers/iommu/intel/pasid.c | 94 +++++++++++++++++++++++++++++++++++++
>>  drivers/iommu/intel/pasid.h |  4 ++
>>  4 files changed, 207 insertions(+)
>
> The code is probably incomplete. When attaching a domain to a device,
> check the domain's dirty tracking capability against the device's
> capabilities. If the domain's dirty tracking capability is set but the
> device does not support it, the attach callback should return -EINVAL.
>

Yeap, I did that for AMD, but it seems in the mix of changes I may have
deleted and then forgot to include it here.
Here's what I added (together with consolidated cap checking):

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 7d5a8f5283a7..fabfe363f1f9 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4075,6 +4075,11 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
 	return NULL;
 }
 
+static bool intel_iommu_slads_supported(struct intel_iommu *iommu)
+{
+	return sm_supported(iommu) && ecap_slads(iommu->ecap);
+}
+
 static struct iommu_domain *
 intel_iommu_domain_alloc_user(struct device *dev, u32 flags)
 {
@@ -4090,7 +4095,7 @@ intel_iommu_domain_alloc_user(struct device *dev, u32 flags)
 		return ERR_PTR(-EOPNOTSUPP);
 
 	if (enforce_dirty &&
-	    !device_iommu_capable(dev, IOMMU_CAP_DIRTY))
+	    !intel_iommu_slads_supported(iommu))
 		return ERR_PTR(-EOPNOTSUPP);
 
 	domain = iommu_domain_alloc(dev->bus);
@@ -4121,6 +4126,9 @@ static int prepare_domain_attach_device(struct iommu_domain *domain,
 	if (dmar_domain->force_snooping && !ecap_sc_support(iommu->ecap))
 		return -EINVAL;
 
+	if (domain->dirty_ops && !intel_iommu_slads_supported(iommu))
+		return -EINVAL;
+
 	/* check if this iommu agaw is sufficient for max mapped address */
 	addr_width = agaw_to_width(iommu->agaw);
 	if (addr_width > cap_mgaw(iommu->cap))
@@ -4376,8 +4384,7 @@ static bool intel_iommu_capable(struct device *dev, enum iommu_cap cap)
 	case IOMMU_CAP_ENFORCE_CACHE_COHERENCY:
 		return ecap_sc_support(info->iommu->ecap);
 	case IOMMU_CAP_DIRTY:
-		return sm_supported(info->iommu) &&
-		       ecap_slads(info->iommu->ecap);
+		return intel_iommu_slads_supported(info->iommu);
 	default:
 		return false;
 	}
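
FWIW, the pasid.c side of this (not quoted in the hunk above) boils down to
flipping SSADE in the scalable-mode PASID table entry and then flushing the
caches, as the commit message describes. A rough sketch below just to make the
mechanics concrete -- the helper name is made up and the field placement is an
assumption on my part, following the pasid_set_bits() style already used in
pasid.c, so don't read it as the actual hunk:

/*
 * Sketch only: SSADE is bit 9 of the scalable-mode PASID table entry and the
 * A/D state is cached in the IOTLB, so after toggling the bit the entry and
 * IOTLB need to be flushed (the pasid_flush_caches() path) for the IOMMU to
 * start setting dirty bits again, per "6.5.3.3 Guidance to Software for
 * Invalidations". Placing the bit in val[0] is an assumption here, and
 * pasid_entry_toggle_ssade() is a made-up name for illustration.
 */
static inline void pasid_entry_toggle_ssade(struct pasid_entry *pe, bool enable)
{
	u64 ssade = enable ? (1ULL << 9) : 0;

	pasid_set_bits(&pe->val[0], 1ULL << 9, ssade);
}

With the prepare_domain_attach_device() check above, a domain carrying
dirty_ops can never end up attached behind an IOMMU without SSADS, which
covers the -EINVAL case you pointed out.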