Re: [PATCH v5 13/19] iommufd: Add kAPI toward external drivers for physical devices

On Sun, Nov 27, 2022 at 10:13:55PM +0100, Eric Auger wrote:
> > +static int iommufd_device_setup_msi(struct iommufd_device *idev,
> > +				    struct iommufd_hw_pagetable *hwpt,
> > +				    phys_addr_t sw_msi_start,
> > +				    unsigned int flags)
> > +{
> > +	int rc;
> > +
> > +	/*
> > +	 * IOMMU_CAP_INTR_REMAP means that the platform is isolating MSI, and it
> rather means that the *IOMMU* implements IRQ remapping.

Not really. The name is a mess, but as it is implemented, it means the
platform is implementing MSI security. How exactly that is done is not
really defined, and it doesn't really belong as an iommu property.
However the security is created, it is done in a way that is
transparent to the iommu_domain user.

MSI security means that the originating device (eg the RID for PCI) is
validated when the MSI is processed. RIDs can only use MSIs they are
authorized to use.

It doesn't matter how it is done, whether it is remapping HW, fancy
iommu_domain tricks, or something built into the MSI controller. Set
this flag
if the platform is secure and doesn't need the code triggered by
irq_domain_check_msi_remap().
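
As a rough userspace model of the policy described above (not kernel
code; the two booleans stand in for device_iommu_capable(idev->dev,
IOMMU_CAP_INTR_REMAP) and irq_domain_check_msi_remap(), and the enum
names are made up for illustration):

```c
#include <assert.h>
#include <stdbool.h>

/* Outcome of the attach-time MSI security check. */
enum msi_policy {
	MSI_TRANSPARENT,	/* platform isolates MSIs; nothing further to do */
	MSI_NEEDS_COOKIE,	/* isolation exists but the cookie dance is needed */
	MSI_INSECURE,		/* no isolation at all; refuse the attach */
};

/*
 * Classify the platform. intr_remap_cap models the transparent case
 * (IOMMU_CAP_INTR_REMAP); msi_remap models the non-transparent case
 * (irq_domain_check_msi_remap()).
 */
static enum msi_policy classify_msi(bool intr_remap_cap, bool msi_remap)
{
	if (intr_remap_cap)
		return MSI_TRANSPARENT;
	if (msi_remap)
		return MSI_NEEDS_COOKIE;
	return MSI_INSECURE;
}
```

The point of the model is that the two checks are ordered: the
transparent capability short-circuits everything, and only the
non-transparent path obliges the iommu_domain user to do more work.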

> irq_domain_check_msi_remap() instead means the MSI controller
> implements that functionality (a given device id is able to trigger

Not quite: it means that MSI isolation is available; however, it is
not transparent, and the iommu_domain user must do the little dance
that follows.

It does not mean the MSI controller implements the security
functionality because it is not set on x86.

Logically on x86 we should consider the remapping logic as part of the
MSI controller even if x86 makes them separated for historical
reasons.

> MSI #n and this #n gets translated into actual MSI #m) So what you
> want is to prevent an assigned device from being able to DMA into an
> MSI doorbell that is not protected by either the IOMMU or the MSI
> controller. If this happens this means the DMA can generate any kind
> of MSI traffic that can jeopardize the host or other VMs

I don't know of any real systems that work like this. ARM standard GIC
uses a shared ITS page, the IOMMU does nothing to provide MSI
security. MSI security is built into the GIC because it validates the
device ID as part of processing the MSI. The IOMMU is only involved
to grant access to the shared ITS page.

Intel is similar, it provides security through the MSI controller's
remapping logic, the only difference is that on Intel the MSI window
is always present in every iommu_domain (because it occurs before the
IOMMU), and in ARM you have to do the little dance.

Even if the iommu is to be involved it is all hidden from this layer.

> and afterwards resv_msi is checked to see if we need to create the
> so-called msi cookie.  This msi cookie tells the MSI writes are
> translated by the IOMMU and somebody must create a mapping for those
> transactions.

The cookie is always necessary if we are going to use the
non-transparent mode. That is what makes it the non-transparent
mode. We have to supply the reserved IOVA range from one part of the
iommu driver to another part.
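
A toy model of what the cookie actually conveys (all names here are
hypothetical, chosen for illustration; in the kernel the analogous
call is iommu_get_msi_cookie()):

```c
#include <assert.h>
#include <stdint.h>

/*
 * The cookie carries the reserved IOVA base (sw_msi_start) from the
 * attach path to the part of the driver that later maps MSI doorbell
 * pages into the domain.
 */
struct msi_cookie_model {
	uint64_t msi_iova_base;	/* sw_msi_start from the resv region */
};

static void set_msi_cookie(struct msi_cookie_model *c, uint64_t base)
{
	c->msi_iova_base = base;
}

/*
 * Doorbell pages are placed inside the reserved window; the
 * iommu_domain user never needs to know the physical doorbell address.
 */
static uint64_t msi_doorbell_iova(const struct msi_cookie_model *c,
				  uint64_t page_index, uint64_t page_size)
{
	return c->msi_iova_base + page_index * page_size;
}
```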

> > +	 * creates the MSI window by default in the iommu domain. Nothing
> > +	 * further to do.
> > +	 */
> > +	if (device_iommu_capable(idev->dev, IOMMU_CAP_INTR_REMAP))
> > +		return 0;
> > +
> > +	/*
> > +	 * On ARM systems that set the global IRQ_DOMAIN_FLAG_MSI_REMAP every
> > +	 * allocated iommu_domain will block interrupts by default and this
> It sounds there is a confusion between IRQ remapping and the fact MSI
> writes are not bypassed by the IOMMU.

I don't think I'm confused :)

> > +static int iommufd_device_do_attach(struct iommufd_device *idev,
> > +				    struct iommufd_hw_pagetable *hwpt,
> > +				    unsigned int flags)
> > +{
> > +	phys_addr_t sw_msi_start = 0;
> > +	int rc;
> > +
> > +	mutex_lock(&hwpt->devices_lock);
> > +
> > +	/*
> > +	 * Try to upgrade the domain we have, it is an iommu driver bug to
> > +	 * report IOMMU_CAP_ENFORCE_CACHE_COHERENCY but fail
> > +	 * enforce_cache_coherency when there are no devices attached to the
> > +	 * domain.
> > +	 */
> > +	if (idev->enforce_cache_coherency && !hwpt->enforce_cache_coherency) {
> > +		if (hwpt->domain->ops->enforce_cache_coherency)
> > +			hwpt->enforce_cache_coherency =
> > +				hwpt->domain->ops->enforce_cache_coherency(
> > +					hwpt->domain);
> > +		if (!hwpt->enforce_cache_coherency) {
> > +			WARN_ON(list_empty(&hwpt->devices));
> > +			rc = -EINVAL;
> > +			goto out_unlock;
> > +		}
> > +	}
> > +
> > +	rc = iopt_table_enforce_group_resv_regions(&hwpt->ioas->iopt, idev->dev,
> > +						   idev->group, &sw_msi_start);
> > +	if (rc)
> > +		goto out_unlock;
> > +
> so in the case of any IOMMU_RESV_MSI, iommufd_device_setup_msi() will be
> called with *sw_msi_start = 0 which will return -EPERM?
> I don't think this is what we want. In that case I think we want the
> RESV_MSI region to be taken into account as a RESV region but we don't
> need the MSI cookie.

This was sort of sloppily copied from VFIO - we should just delete
it. There is no driver that sets both, and once the platform asserts
irq_domain_check_msi_remap() it is going down the non-transparent path
anyhow and must set a cookie to work. [again the names don't make
any sense for the functionality]

Failing with EPERM is probably not so bad since the platform is using
an invalid configuration. I'm kind of inclined to leave this for right
now since it has all been tested and I don't want to introduce a
regression. But I can try to send a patch to tidy it a bit more, maybe
add an appropriate WARN_ON.

The whole thing is actually backwards. The IOMMU_CAP_INTR_REMAP should
have been some global irq_has_secure_msi(), set by everything with
interrupt remapping, and ARM should set it too.

Then the domain should have had a domain cap
IOMMU_CAP_REQUIRE_MSI_WINDOW_SETUP to do the little dance ARM drivers
need.

And even better would have been to not have the little dance in the
first place, as we don't really need the iommu_domain user to convey
the fixed MSI window from one part of the iommu driver to another.
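
As a toy model, the split described above might look like the
following. Every name here (irq_has_secure_msi, the cap bit, the
helper) comes from the email or is invented for this sketch; none of
these are existing kernel APIs:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-domain cap: this domain needs the window dance. */
#define IOMMU_CAP_REQUIRE_MSI_WINDOW_SETUP (1u << 0)

/*
 * Hypothetical attach-time check: a single global "is MSI secure"
 * predicate gates the attach, and a domain cap says whether the ARM
 * drivers' window setup is additionally required.
 */
static int attach_msi_check(bool irq_has_secure_msi,
			    unsigned int domain_caps,
			    bool *needs_window_setup)
{
	if (!irq_has_secure_msi)
		return -1;	/* stand-in for -EPERM */
	*needs_window_setup =
		!!(domain_caps & IOMMU_CAP_REQUIRE_MSI_WINDOW_SETUP);
	return 0;
}
```

In this shape x86 and ARM would both set the global predicate, and
only the ARM-style drivers would set the per-domain cap.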

We may try to fix this up when we get to doing nesting on ARM, or
maybe just leave the confusing sort of working thing as-is. I don't
know.

Jason


