Hi Allen,

> From: kvm-owner@xxxxxxxxxxxxxxx [mailto:kvm-owner@xxxxxxxxxxxxxxx] On Behalf Of Lu Baolu
> Sent: Wednesday, December 11, 2019 10:12 AM
> To: Joerg Roedel <joro@xxxxxxxxxx>; David Woodhouse <dwmw2@xxxxxxxxxxxxx>;
> Subject: [PATCH v3 5/6] iommu/vt-d: Flush PASID-based iotlb for iova over first level
>
> When software has changed first-level tables, it should invalidate
> the affected IOTLB and the paging-structure-caches using the PASID-
> based-IOTLB Invalidate Descriptor defined in spec 6.5.2.4.
>
> Signed-off-by: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
> ---
>  drivers/iommu/dmar.c        | 41 ++++++++++++++++++++++++++++++++++
>  drivers/iommu/intel-iommu.c | 44 ++++++++++++++++++++++++-------------
>  include/linux/intel-iommu.h |  2 ++
>  3 files changed, 72 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
> index 3acfa6a25fa2..fb30d5053664 100644
> --- a/drivers/iommu/dmar.c
> +++ b/drivers/iommu/dmar.c
> @@ -1371,6 +1371,47 @@ void qi_flush_dev_iotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
>  	qi_submit_sync(&desc, iommu);
>  }
>
> +/* PASID-based IOTLB invalidation */
> +void qi_flush_piotlb(struct intel_iommu *iommu, u16 did, u32 pasid, u64 addr,
> +		     unsigned long npages, bool ih)
> +{
> +	struct qi_desc desc = {.qw2 = 0, .qw3 = 0};
> +
> +	/*
> +	 * npages == -1 means a PASID-selective invalidation, otherwise,
> +	 * a positive value for Page-selective-within-PASID invalidation.
> +	 * 0 is not a valid input.
> +	 */
> +	if (WARN_ON(!npages)) {
> +		pr_err("Invalid input npages = %ld\n", npages);
> +		return;
> +	}
> +
> +	if (npages == -1) {
> +		desc.qw0 = QI_EIOTLB_PASID(pasid) |
> +				QI_EIOTLB_DID(did) |
> +				QI_EIOTLB_GRAN(QI_GRAN_NONG_PASID) |
> +				QI_EIOTLB_TYPE;
> +		desc.qw1 = 0;
> +	} else {
> +		int mask = ilog2(__roundup_pow_of_two(npages));
> +		unsigned long align = (1ULL << (VTD_PAGE_SHIFT + mask));
> +
> +		if (WARN_ON_ONCE(!ALIGN(addr, align)))
> +			addr &= ~(align - 1);
> +
> +		desc.qw0 = QI_EIOTLB_PASID(pasid) |
> +				QI_EIOTLB_DID(did) |
> +				QI_EIOTLB_GRAN(QI_GRAN_PSI_PASID) |
> +				QI_EIOTLB_TYPE;
> +		desc.qw1 = QI_EIOTLB_ADDR(addr) |
> +				QI_EIOTLB_IH(ih) |
> +				QI_EIOTLB_AM(mask);
> +	}
> +
> +	qi_submit_sync(&desc, iommu);
> +}
> +
>  /*
>   * Disable Queued Invalidation interface.
>   */
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 83a7abf0c4f0..e47f5fe37b59 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -1520,18 +1520,24 @@ static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
>
>  	if (ih)
>  		ih = 1 << 6;
> -	/*
> -	 * Fallback to domain selective flush if no PSI support or the size is
> -	 * too big.
> -	 * PSI requires page size to be 2 ^ x, and the base address is naturally
> -	 * aligned to the size
> -	 */
> -	if (!cap_pgsel_inv(iommu->cap) || mask > cap_max_amask_val(iommu->cap))
> -		iommu->flush.flush_iotlb(iommu, did, 0, 0,
> -						DMA_TLB_DSI_FLUSH);
> -	else
> -		iommu->flush.flush_iotlb(iommu, did, addr | ih, mask,
> -						DMA_TLB_PSI_FLUSH);
> +
> +	if (domain_use_first_level(domain)) {
> +		qi_flush_piotlb(iommu, did, domain->default_pasid,
> +				addr, pages, ih);

I'm not sure my understanding is correct, so let me walk through a
scenario. Assume we assign an mdev and a PF/VF to a single VM. Then there
will be p_iotlb entries tagged with PASID_RID2PASID and p_iotlb entries
tagged with default_pasid. We may want to flush both...
If this operation is invoked per-device, then we need to pass in a hint
indicating whether to use PASID_RID2PASID or default_pasid. Alternatively,
you could just issue two flushes, one for each PASID value. Thoughts?

Regards,
Yi Liu