On Thu, 20 Apr 2023 17:55:22 +0100
Robin Murphy <robin.murphy@xxxxxxx> wrote:

> On 20/04/2023 3:49 pm, Alex Williamson wrote:
> > On Thu, 20 Apr 2023 15:19:55 +0100
> > Robin Murphy <robin.murphy@xxxxxxx> wrote:
> >
> >> On 2023-04-20 15:15, Alex Williamson wrote:
> >>> On Thu, 20 Apr 2023 06:52:01 +0000
> >>> "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
> >>>
> >>>> Hi, Alex,
> >>>>
> >>>> Happen to see that we may have an inconsistent policy about RMRR devices
> >>>> across different vendors.
> >>>>
> >>>> Previously only Intel supported RMRR. Now both AMD and ARM have similar
> >>>> things, AMD IVMD and ARM RMR.
> >>>
> >>> Any similar requirement imposed by system firmware that the operating
> >>> system must perpetually maintain a specific IOVA mapping for the device
> >>> should impose similar restrictions as we've implemented for VT-d
> >>> RMRR[1].  Thanks,
> >>
> >> Hmm, does that mean that vfio_iommu_resv_exclude() going to the trouble
> >> of punching out all the reserved region holes isn't really needed?
> >
> > While "Reserved Memory Region Reporting" might suggest that the ranges
> > are simply excluded, RMRRs actually require that specific mappings are
> > maintained for ongoing, side-band activity, which is not compatible
> > with the idea that userspace owns the IOVA address space for the
> > device, or with the separation of host vs. userspace control of the
> > device. Such mappings suggest things like system health monitoring,
> > where the influence of a user-owned device can easily extend to a
> > system-wide scope if the user is able to manipulate the device to deny
> > that interaction or report bad data.
> >
> > If these ARM and AMD tables impose similar requirements, we should
> > really be restricting devices encumbered by such requirements from
> > userspace access as well.
> > Thanks,
>
> Indeed, the primary use case behind Arm's RMRs was certain devices like
> big, complex RAID controllers which have already been started by UEFI
> firmware at boot and have live in-memory data which needs to be
> preserved.
>
> However, my point was more that if it's a VFIO policy that any device
> with an IOMMU_RESV_DIRECT reservation is not suitable for userspace
> assignment, then vfio_iommu_type1_attach_group() already has everything
> it would need to robustly enforce that policy itself. It seems silly to
> me for it to expect the IOMMU driver to fail the attach, then go ahead
> and dutifully punch out direct regions if it happened not to. A couple
> of obvious trivial tweaks and there could be no dependency on driver
> behaviour at all, other than correctly reporting resv_regions to begin
> with.
>
> If we think this policy deserves to go beyond VFIO and userspace, and
> it's reasonable that such devices should never be allowed to attach to
> any other kind of kernel-owned unmanaged domain either, then we can
> still trivially enforce that in core IOMMU code. I really see no need
> for it to be in drivers at all.

It seems like a reasonable choice to me that any mixing of unmanaged
domains with IOMMU_RESV_DIRECT could be restricted globally. Do we even
have infrastructure for a driver to honor the necessary mapping
requirements?
It looks pretty easy to do as well, something like this (untested):

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 10db680acaed..521f9a731ce9 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2012,11 +2012,29 @@ static void __iommu_group_set_core_domain(struct iommu_group *group)
 static int __iommu_attach_device(struct iommu_domain *domain,
 				 struct device *dev)
 {
-	int ret;
+	int ret = 0;
 
 	if (unlikely(domain->ops->attach_dev == NULL))
 		return -ENODEV;
 
+	if (domain->type == IOMMU_DOMAIN_UNMANAGED) {
+		struct iommu_resv_region *region;
+		LIST_HEAD(resv_regions);
+
+		iommu_get_resv_regions(dev, &resv_regions);
+		list_for_each_entry(region, &resv_regions, list) {
+			if (region->type == IOMMU_RESV_DIRECT) {
+				ret = -EPERM;
+				break;
+			}
+		}
+		iommu_put_resv_regions(dev, &resv_regions);
+		if (ret) {
+			dev_warn(dev, "Device may not be used with an unmanaged IOMMU domain due to reserved direct mapping requirement.\n");
+			return ret;
+		}
+	}
+
 	ret = domain->ops->attach_dev(domain, dev);
 	if (ret)
 		return ret;

Restrictions in either type1 or iommufd would be pretty trivial as well,
but centralizing it in core IOMMU code would do a better job of covering
all use cases. This effectively makes the VT-d code further down the
same path redundant, so no new restrictions there.

What sort of fall-out should we expect on ARM or AMD? This was a pretty
painful restriction to add on Intel. Thanks,

Alex

> >>> [1] https://access.redhat.com/sites/default/files/attachments/rmrr-wp1.pdf
> >>>
> >>>> RMRR identity mapping was considered unsafe (except for USB/GPU) for
> >>>> device assignment:
> >>>>
> >>>> /*
> >>>>  * There are a couple cases where we need to restrict the functionality of
> >>>>  * devices associated with RMRRs.  The first is when evaluating a device for
> >>>>  * identity mapping because problems exist when devices are moved in and out
> >>>>  * of domains and their respective RMRR information is lost.
> >>>>  This means that
> >>>>  * a device with associated RMRRs will never be in a "passthrough" domain.
> >>>>  * The second is use of the device through the IOMMU API.  This interface
> >>>>  * expects to have full control of the IOVA space for the device.  We cannot
> >>>>  * satisfy both the requirement that RMRR access is maintained and have an
> >>>>  * unencumbered IOVA space.  We also have no ability to quiesce the device's
> >>>>  * use of the RMRR space or even inform the IOMMU API user of the restriction.
> >>>>  * We therefore prevent devices associated with an RMRR from participating in
> >>>>  * the IOMMU API, which eliminates them from device assignment.
> >>>>  *
> >>>>  * In both cases, devices which have relaxable RMRRs are not concerned by this
> >>>>  * restriction.  See device_rmrr_is_relaxable comment.
> >>>>  */
> >>>> static bool device_is_rmrr_locked(struct device *dev)
> >>>> {
> >>>> 	if (!device_has_rmrr(dev))
> >>>> 		return false;
> >>>>
> >>>> 	if (device_rmrr_is_relaxable(dev))
> >>>> 		return false;
> >>>>
> >>>> 	return true;
> >>>> }
> >>>>
> >>>> Then a non-relaxable RMRR device is rejected when doing attach:
> >>>>
> >>>> static int intel_iommu_attach_device(struct iommu_domain *domain,
> >>>> 				     struct device *dev)
> >>>> {
> >>>> 	struct device_domain_info *info = dev_iommu_priv_get(dev);
> >>>> 	int ret;
> >>>>
> >>>> 	if (domain->type == IOMMU_DOMAIN_UNMANAGED &&
> >>>> 	    device_is_rmrr_locked(dev)) {
> >>>> 		dev_warn(dev, "Device is ineligible for IOMMU domain attach due to platform RMRR requirement. Contact your platform vendor.\n");
> >>>> 		return -EPERM;
> >>>> 	}
> >>>> 	...
> >>>> }
> >>>>
> >>>> But I didn't find the same check in the AMD/ARM drivers at a glance.
> >>>>
> >>>> Did I overlook some arch difference which makes RMRR devices safe on
> >>>> those platforms, or is it a gap to be fixed?
> >>>>
> >>>> Thanks,
> >>>> Kevin
> >>>>
> >>>
> >>
> >
>
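[Editor's note: the decision logic common to both the VT-d check and Alex's proposed core-IOMMU check — "refuse an unmanaged-domain attach if any reserved region demands a permanent direct mapping" — can be illustrated in plain user-space C. The sketch below is hypothetical: `resv_region`, `check_unmanaged_attach()`, and the `RESV_*` enum only mirror, and do not reuse, the kernel's `struct iommu_resv_region`, `IOMMU_RESV_DIRECT`, and list primitives.]

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant kernel iommu_resv_type values. */
enum resv_type {
	RESV_DIRECT,          /* firmware-mandated permanent identity mapping */
	RESV_DIRECT_RELAXABLE,/* relaxable (e.g. USB/GPU) - may be dropped   */
	RESV_RESERVED,        /* plain hole to exclude from the IOVA space   */
};

/* Simplified model of struct iommu_resv_region on a singly linked list. */
struct resv_region {
	enum resv_type type;
	struct resv_region *next;
};

/*
 * Model of the proposed __iommu_attach_device() gate: walk the device's
 * reserved regions and return -EPERM if any non-relaxable direct mapping
 * must be maintained, since an unmanaged domain cannot honor it.
 */
static int check_unmanaged_attach(const struct resv_region *regions)
{
	for (const struct resv_region *r = regions; r; r = r->next)
		if (r->type == RESV_DIRECT)
			return -EPERM;
	return 0;	/* no permanent mapping requirement; attach may proceed */
}
```

The sketch also captures why relaxable regions pass: only `RESV_DIRECT` trips the check, matching `device_rmrr_is_relaxable()` exempting USB/GPU devices in the VT-d code above.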