On Wed, Nov 09, 2022 at 11:28:22AM -0700, Alex Williamson wrote:

> > > > I'd be much more comfortable with this as a system wide iommufd flag
> > > > if we also tied it to do some demonstration of privilege - eg a
> > > > requirement to open iommufd with CAP_SYS_RAWIO for instance.
> > >
> > > Which is not compatible to existing use cases, which is also why we
> > > can't invent some way to allow some applications to run without CPU
> > > mitigations, while requiring it for others as a baseline.
> >
> > Isn't it? Didn't we learn that libvirt runs as root and will open and
> > pass the iommufd as root?
>
> We're jumping ahead to native iommufd support here, what happens when
> VFIO_CONTAINER=n and it's QEMU opening the fds, with only file access
> privileges?

Yes, but I am thinking aloud about how best to do this in native
iommufd modes.

> > I think so. At least you should have something to shut down an
> > insecure feature in kernel lockdown modes. CAP_SYS_RAWIO is a simple
> > way to do it.
>
> How are CPU vulnerabilities handled in lockdown mode, do apps require
> certain capabilities to run fast vs safe, or do we simply disallow
> unsafe globally in lockdown? I think we have a lot more leniency to
> ignore/disallow flags that enable global insecurities when any sort of
> lockdown is imposed.

The CPU things are all information leaks from the kernel to
userspace. Lockdown is about preserving kernel operating integrity -
eg preventing modification or hijacking of the running kernel.

So, like you say below, this is kind of in between: it is not
information leakage, and it is hopefully not an integrity issue.
Being more of a DoS, maybe it is fine under the lockdown scenarios.
At least I am happier to hear that.

> > vfio-iommufd seems like overkill, I think your first suggestion to put
> > in vfio.ko was more practical.
>
> Convenient perhaps, but architecturally the wrong place for it.

Ah, that is pretty subjective. If the architecture is that the
iommufd user subsystem opts in to this insecurity then it is an OK
place. If it is that iommufd sets it globally for the whole system
then it is the wrong place.

We could also talk about a per-vfio_device sysfs to control this?
Then we can make the sysfs only appear for vfio_devices using the
iommu_domain part of iommufd/vfio.

That has a nice sort of compat shape as we can make the existing
module option default the sysfs to insecure.

Jason