> From: Jason Gunthorpe <jgg@xxxxxxxxxx> > Sent: Monday, October 16, 2023 7:58 PM > > On Mon, Oct 16, 2023 at 08:48:03AM +0000, Tian, Kevin wrote: > > > From: Nicolin Chen <nicolinc@xxxxxxxxxx> > > > Sent: Saturday, October 14, 2023 8:45 AM > > > > > > On Tue, Sep 26, 2023 at 01:16:35AM -0700, Tian, Kevin wrote: > > > > > From: Liu, Yi L <yi.l.liu@xxxxxxxxx> > > > > > Sent: Thursday, September 21, 2023 3:51 PM > > > > > > > > > > From: Nicolin Chen <nicolinc@xxxxxxxxxx> > > > > > > > > > > Now enforce_cache_coherency and msi_cookie are kernel-managed > hwpt > > > > > things. > > > > > So, they should be only setup on kernel-managed domains. If the > > > attaching > > > > > domain is a user-managed domain, redirect the hwpt to hwpt->parent > to > > > do > > > > > it correctly. > > > > > > > > > > > > > No redirection. The parent should already have the configuration done > > > > when it's created. It shouldn't be triggered in the nesting path. > > > > > > iommufd_hw_pagetable_enforce_cc() is not only called in alloc(), > > > but also in hwpt_attach/replace() if cc is not enforced by the > > > alloc() because the idev that initiates the hwpt_alloc() might > > > not have idev->enforce_cache_coherency. Only when another idev > > > that has idev->enforce_cache_coherency attaches to the shared > > > hwpt, the cc configuration would be done. > > > > is this a bug already? If the 1st device doesn't have enforce_cc in its > > iommu, setting the snp bit in the hwpt would lead to reserved > > bit violation. > > I suspect there are technically some gaps in the intel driver, yes.. double checked. intel driver is doing right thing now: intel_iommu_enforce_cache_coherency() domain_support_force_snooping() static bool domain_support_force_snooping(struct dmar_domain *domain) { struct device_domain_info *info; bool support = true; assert_spin_locked(&domain->lock); list_for_each_entry(info, &domain->devices, link) { if (!ecap_sc_support(info->iommu->ecap)) { support = false; break; } } return support; } > > > another problem is that intel_iommu_enforce_cache_coherency() > > doesn't update existing entries. It only sets a domain flag to affect > > future mappings. so it means the 2nd idev is also broken. > > This is such a gap, intel driver should not permit that. yes. @Baolu, can you add a fix? > > > The simplest option is to follow vfio type1 i.e. don't mix devices > > with different enforce_cc in one domain. > > This is why I wanted to get rid of this bad mechanism going forward. > > Manually created hwpt should have a manual specification of cc and > then we don't have so many problems. > > It means userspace needs to compute if they want to use CC or not, but > userspace already needs to figure this out since without autodomains > it must create two hwpts manually anyhow. > Now there is no interface reporting enforce_cc to userspace. Actually the user doesn't need to know enforce_cc. As long as there is an attach incompatibility error the user has to create a 2nd hwpt for the to-be-attached device then enforce_cc will be handled automatically in hwpt_alloc. I prefer to removing enforce_cc in attach fn completely then no parent trick in this patch. Just keep it in hwpt_alloc and leave to iommu driver to figure out the attaching compatibility: - attaching a idev with matching enforce_cc as hwpt is always allowed. - attaching a idev w/o enforce_cc to a hwpt with enforce_cc is rejected. - attaching a idev w/ enforce_cc to a hwpt w/o enforce_cc succeeds with hwpt continuing to be w/o enforce_cc. No promotion. above has been guaranteed by intel iommu driver.