On 16/10/2023 03:21, Baolu Lu wrote:
> On 9/23/23 9:25 AM, Joao Martins wrote:
> [...]
>> +/*
>> + * Set up dirty tracking on a second only translation type.
>
> Set up dirty tracking on a second only or nested translation type.
>
>> + */
>> +int intel_pasid_setup_dirty_tracking(struct intel_iommu *iommu,
>> +				     struct dmar_domain *domain,
>> +				     struct device *dev, u32 pasid,
>> +				     bool enabled)
>> +{
>> +	struct pasid_entry *pte;
>> +	u16 did, pgtt;
>> +
>> +	spin_lock(&iommu->lock);
>> +
>> +	did = domain_id_iommu(domain, iommu);
>> +	pte = intel_pasid_get_entry(dev, pasid);
>> +	if (!pte) {
>> +		spin_unlock(&iommu->lock);
>> +		dev_err(dev, "Failed to get pasid entry of PASID %d\n", pasid);
>
> Use dev_err_ratelimited() to avoid user DOS attack.
>

OK

>> +		return -ENODEV;
>> +	}
>
> Can we add a check to limit this interface to second-only and nested
> translation types? These are the only valid use cases currently and for
> the foreseeable future.
>

OK.

> And, return directly if the pasid bit matches the target state.
>

OK

> [...]
> 	spin_lock(&iommu->lock);
> 	pte = intel_pasid_get_entry(dev, pasid);
> 	if (!pte) {
> 		spin_unlock(&iommu->lock);
> 		dev_err_ratelimited(dev,
> 				    "Failed to get pasid entry of PASID %d\n",
> 				    pasid);
> 		return -ENODEV;
> 	}
>
> 	did = domain_id_iommu(domain, iommu);
> 	pgtt = pasid_pte_get_pgtt(pte);
>
> 	if (pgtt != PASID_ENTRY_PGTT_SL_ONLY && pgtt != PASID_ENTRY_PGTT_NESTED) {
> 		spin_unlock(&iommu->lock);
> 		dev_err_ratelimited(dev,
> 				    "Dirty tracking not supported on translation type %d\n",
> 				    pgtt);
> 		return -EOPNOTSUPP;
> 	}
>
> 	if (pasid_get_ssade(pte) == enabled) {
> 		spin_unlock(&iommu->lock);
> 		return 0;
> 	}
>
> 	if (enabled)
> 		pasid_set_ssade(pte);
> 	else
> 		pasid_clear_ssade(pte);
> 	spin_unlock(&iommu->lock);
> [...]
OK

>> +
>> +	pgtt = pasid_pte_get_pgtt(pte);
>> +
>> +	if (enabled)
>> +		pasid_set_ssade(pte);
>> +	else
>> +		pasid_clear_ssade(pte);
>> +	spin_unlock(&iommu->lock);
>
> Add below here:
>
> 	if (!ecap_coherent(iommu->ecap))
> 		clflush_cache_range(pte, sizeof(*pte));
>

Got it

>> +
>> +	/*
>> +	 * From VT-d spec table 25 "Guidance to Software for Invalidations":
>> +	 *
>> +	 * - PASID-selective-within-Domain PASID-cache invalidation
>> +	 *   If (PGTT=SS or Nested)
>> +	 *    - Domain-selective IOTLB invalidation
>> +	 *   Else
>> +	 *    - PASID-selective PASID-based IOTLB invalidation
>> +	 * - If (pasid is RID_PASID)
>> +	 *    - Global Device-TLB invalidation to affected functions
>> +	 *   Else
>> +	 *    - PASID-based Device-TLB invalidation (with S=1 and
>> +	 *      Addr[63:12]=0x7FFFFFFF_FFFFF) to affected functions
>> +	 */
>> +	pasid_cache_invalidation_with_pasid(iommu, did, pasid);
>> +
>> +	if (pgtt == PASID_ENTRY_PGTT_SL_ONLY || pgtt == PASID_ENTRY_PGTT_NESTED)
>> +		iommu->flush.flush_iotlb(iommu, did, 0, 0, DMA_TLB_DSI_FLUSH);
>> +	else
>> +		qi_flush_piotlb(iommu, did, pasid, 0, -1, 0);
>
> Only "Domain-selective IOTLB invalidation" is needed here.
>

Will delete the qi_flush_piotlb() then.

>> +
>> +	/* Device IOTLB doesn't need to be flushed in caching mode. */
>> +	if (!cap_caching_mode(iommu->cap))
>> +		devtlb_invalidation_with_pasid(iommu, dev, pasid);
>
> For the device IOTLB invalidation, need to follow what spec requires.
>
> 	If (pasid is RID_PASID)
> 	 - Global Device-TLB invalidation to affected functions
> 	Else
> 	 - PASID-based Device-TLB invalidation (with S=1 and
> 	   Addr[63:12]=0x7FFFFFFF_FFFFF) to affected functions
>

devtlb_invalidation_with_pasid() underneath does:

	if (pasid == PASID_RID2PASID)
		qi_flush_dev_iotlb(iommu, sid, pfsid, qdep, 0,
				   64 - VTD_PAGE_SHIFT);
	else
		qi_flush_dev_iotlb_pasid(iommu, sid, pfsid, pasid, qdep, 0,
					 64 - VTD_PAGE_SHIFT);

... which is what the spec suggests (IIUC). Should I read your comment above
as asking to drop the cap_caching_mode(iommu->cap) check?