> From: Liu, Yi L <yi.l.liu@xxxxxxxxx> > Sent: Thursday, November 7, 2024 12:21 PM > > On 2024/11/7 10:52, Baolu Lu wrote: > > On 11/6/24 23:45, Yi Liu wrote: > >> +int intel_pasid_replace_first_level(struct intel_iommu *iommu, > >> + struct device *dev, pgd_t *pgd, > >> + u32 pasid, u16 did, u16 old_did, > >> + int flags) > >> +{ > >> + struct pasid_entry *pte; > >> + > >> + if (!ecap_flts(iommu->ecap)) { > >> + pr_err("No first level translation support on %s\n", > >> + iommu->name); > >> + return -EINVAL; > >> + } > >> + > >> + if ((flags & PASID_FLAG_FL5LP) && !cap_fl5lp_support(iommu->cap)) { > >> + pr_err("No 5-level paging support for first-level on %s\n", > >> + iommu->name); > >> + return -EINVAL; > >> + } > >> + > >> + spin_lock(&iommu->lock); > >> + pte = intel_pasid_get_entry(dev, pasid); > >> + if (!pte) { > >> + spin_unlock(&iommu->lock); > >> + return -ENODEV; > >> + } > >> + > >> + if (!pasid_pte_is_present(pte)) { > >> + spin_unlock(&iommu->lock); > >> + return -EINVAL; > >> + } > >> + > >> + WARN_ON(old_did != pasid_get_domain_id(pte)); > >> + > >> + pasid_pte_config_first_level(iommu, pte, pgd, did, flags); > >> + spin_unlock(&iommu->lock); > >> + > >> + intel_pasid_flush_present(iommu, dev, pasid, old_did, pte); > >> + intel_iommu_drain_pasid_prq(dev, pasid); > >> + > >> + return 0; > >> +} > > > > pasid_pte_config_first_level() causes the pasid entry to transition from > > present to non-present and then to present. In this case, calling > > intel_pasid_flush_present() is not accurate, as it is only intended for > > pasid entries transitioning from present to present, according to the > > specification. > > > > It's recommended to move pasid_clear_entry(pte) and > > pasid_set_present(pte) out to the caller, so ... > > > > For setup case (pasid from non-present to present): > > > > - pasid_clear_entry(pte) > > - pasid_pte_config_first_level(pte) > > - pasid_set_present(pte) > > - cache invalidations > > > > For replace case (pasid from present to present) > > > > - pasid_pte_config_first_level(pte) > > - cache invalidations > > > > The same applies to other types of setup and replace. > > hmmm. Here is the reason I did it in the way of this patch: > 1) pasid_clear_entry() can clear all the fields that are not supposed to > be used by the new domain. For example, converting a nested domain to > SS > only domain, if no pasid_clear_entry() then the FSPTR would be there. > Although spec seems not enforce it, it might be good to clear it. > 2) We don't support atomic replace yet, so the whole pasid entry transition > is not done in one shot, so it looks to be ok to do this stepping > transition. > 3) It seems to be even worse if keep the Present bit during the transition. > The pasid entry might be broken while the Present bit indicates this is > a valid pasid entry. Say if there is in-flight DMA, the result may be > unpredictable. > > Based on the above, I chose the current way. But I admit if we are going to > support atomic replace, then we should refactor a bit. I believe at that > time we need to construct the new pasid entry first and try to exchange it > to the pasid table. I can see some transition can be done in that way as we > can do atomic exchange with 128bits. thoughts? :) > yes 128bit cmpxchg is necessary to support atomic replacement. Actually vt-d spec clearly says so e.g. SSPTPTR/DID must be updated together in a present entry to not break in-flight DMA. but... your current way (clear entry then update it) also break in-flight DMA. So let's admit that as the 1st step it's not aimed to support atomic replacement. With that Baolu's suggestion makes more sense toward future extension with less refactoring required (otherwise you should not use intel_pasid_flush_present() then the earlier refactoring for that helper is also meaningless).