On 2024/11/7 15:57, Yi Liu wrote:
On 2024/11/7 14:53, Baolu Lu wrote:
On 2024/11/7 14:46, Yi Liu wrote:
On 2024/11/7 13:46, Tian, Kevin wrote:
From: Liu, Yi L <yi.l.liu@xxxxxxxxx>
Sent: Thursday, November 7, 2024 12:21 PM
On 2024/11/7 10:52, Baolu Lu wrote:
On 11/6/24 23:45, Yi Liu wrote:
+int intel_pasid_replace_first_level(struct intel_iommu *iommu,
+ struct device *dev, pgd_t *pgd,
+ u32 pasid, u16 did, u16 old_did,
+ int flags)
+{
+ struct pasid_entry *pte;
+
+ if (!ecap_flts(iommu->ecap)) {
+ pr_err("No first level translation support on %s\n",
+ iommu->name);
+ return -EINVAL;
+ }
+
+ if ((flags & PASID_FLAG_FL5LP) && !cap_fl5lp_support(iommu-
>cap)) {
+ pr_err("No 5-level paging support for first-level on %s\n",
+ iommu->name);
+ return -EINVAL;
+ }
+
+ spin_lock(&iommu->lock);
+ pte = intel_pasid_get_entry(dev, pasid);
+ if (!pte) {
+ spin_unlock(&iommu->lock);
+ return -ENODEV;
+ }
+
+ if (!pasid_pte_is_present(pte)) {
+ spin_unlock(&iommu->lock);
+ return -EINVAL;
+ }
+
+ WARN_ON(old_did != pasid_get_domain_id(pte));
+
+ pasid_pte_config_first_level(iommu, pte, pgd, did, flags);
+ spin_unlock(&iommu->lock);
+
+ intel_pasid_flush_present(iommu, dev, pasid, old_did, pte);
+ intel_iommu_drain_pasid_prq(dev, pasid);
+
+ return 0;
+}
pasid_pte_config_first_level() causes the pasid entry to
transition from
present to non-present and then to present. In this case, calling
intel_pasid_flush_present() is not accurate, as it is only
intended for
pasid entries transitioning from present to present, according to the
specification.
It's recommended to move pasid_clear_entry(pte) and
pasid_set_present(pte) out to the caller, so ...
For setup case (pasid from non-present to present):
- pasid_clear_entry(pte)
- pasid_pte_config_first_level(pte)
- pasid_set_present(pte)
- cache invalidations
For replace case (pasid from present to present)
- pasid_pte_config_first_level(pte)
- cache invalidations
The same applies to other types of setup and replace.
hmmm. Here is the reason I did it in the way of this patch:
1) pasid_clear_entry() can clear all the fields that are not
supposed to
be used by the new domain. For example, converting a nested
domain to
SS
only domain, if no pasid_clear_entry() then the FSPTR would be
there.
Although spec seems not enforce it, it might be good to clear it.
2) We don't support atomic replace yet, so the whole pasid entry
transition
is not done in one shot, so it looks to be ok to do this stepping
transition.
3) It seems to be even worse if keep the Present bit during the
transition.
The pasid entry might be broken while the Present bit
indicates this is
a valid pasid entry. Say if there is in-flight DMA, the result
may be
unpredictable.
Based on the above, I chose the current way. But I admit if we are
going to
support atomic replace, then we should refactor a bit. I believe at
that
time we need to construct the new pasid entry first and try to
exchange it
to the pasid table. I can see some transition can be done in that
way as we
can do atomic exchange with 128bits. thoughts? 🙂
yes 128bit cmpxchg is necessary to support atomic replacement.
Actually vt-d spec clearly says so e.g. SSPTPTR/DID must be updated
together in a present entry to not break in-flight DMA.
but... your current way (clear entry then update it) also break in-
flight
DMA. So let's admit that as the 1st step it's not aimed to support
atomic replacement. With that Baolu's suggestion makes more sense
toward future extension with less refactoring required (otherwise
you should not use intel_pasid_flush_present() then the earlier
refactoring for that helper is also meaningless).
I see. The pasid entry might have some filed that is not supposed to be
used after replacement. Should we have a comment about it?
I guess all fields except SSADE and P of a pasid table entry should be
cleared in pasid_pte_config_first_level()?
perhaps we can take one more step forward. We can construct the new pte
in a local variable first and then push it to the pte in the pasid
table. :)
That sounds better! The entry is composed on the stack and then copied
over to the pasid table as a whole.
With these two issues addressed, do u mind sending a new version? Let's
try to catch the pull request window. There are other series (iommufd
and vfio) about user space PASID support depending on this.
--
baolu