Re: [RFC] /dev/ioasid uAPI proposal

On Fri, Jun 04, 2021 at 06:10:51PM +0200, Paolo Bonzini wrote:
> On 04/06/21 18:03, Jason Gunthorpe wrote:
> > On Fri, Jun 04, 2021 at 05:57:19PM +0200, Paolo Bonzini wrote:
> > > I don't want a security proof myself; I want to trust VFIO to make the right
> > > judgment and I'm happy to defer to it (via the KVM-VFIO device).
> > > 
> > > Given how KVM is just a device driver inside Linux, VMs should be a slightly
> > > more roundabout way to do stuff that is accessible to bare metal; not a way
> > > to gain extra privilege.
> > 
> > Okay, fine, let's turn the question on its head then.
> > 
> > VFIO should provide an IOCTL VFIO_EXECUTE_WBINVD so that userspace VFIO
> > application can make use of no-snoop optimizations. The ability of KVM
> > to execute wbinvd should be tied to the ability of that IOCTL to run
> > in a normal process context.
> > 
> > So, under what conditions do we want to allow VFIO to give a process
> > elevated access to the CPU:
> 
> Ok, I would definitely not want to tie it *only* to CAP_SYS_RAWIO (i.e.
> #2+#3 would be worse than what we have today), but IIUC the proposal (was it
> yours or Kevin's?) was to keep #2 and add #1 with an enable/disable ioctl,
> which then would be on VFIO and not on KVM.  

At the end of the day we need an ioctl with two arguments:
 - The 'security proof' FD (ie /dev/vfio/XX, or /dev/ioasid, or whatever)
 - The KVM FD to control wbinvd support on
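As a rough sketch of what those two arguments could look like, here is
a hypothetical argument layout and permission check. The struct name,
its fields, the 'enable' flag, and wbinvd_ctl_check() are all invented
for illustration and are not an existing uAPI. It also assumes that
disabling never needs the proof, which has not been decided here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical argument layout; no such struct exists in any kernel uAPI. */
struct wbinvd_ctl_args {
    int32_t proof_fd;  /* the 'security proof' FD (/dev/vfio/XX, /dev/ioasid, ...) */
    int32_t kvm_fd;    /* the KVM FD whose wbinvd support is being controlled */
    uint32_t enable;   /* nonzero: enable wbinvd, zero: disable (assumed flag) */
};

/*
 * proof_fd_allows_wbinvd stands in for whatever check the owning
 * subsystem performs on proof_fd. Returns 0 or a negative errno.
 */
static int wbinvd_ctl_check(const struct wbinvd_ctl_args *args,
                            bool proof_fd_allows_wbinvd)
{
    assert(args != NULL);

    if (args->proof_fd < 0 || args->kvm_fd < 0)
        return -EBADF;

    /* Assumption: only enabling requires the proof; disabling is always OK. */
    if (args->enable && !proof_fd_allows_wbinvd)
        return -EPERM;

    return 0;
}
```
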

Philosophically it doesn't matter too much which subsystem that ioctl
lives in, but we have these obnoxious cross module dependencies to
consider.

Framing the question, as you have, to be about the process, I think
explains why KVM doesn't really care what is decided, so long as the
process and the VM have equivalent rights.

Alex, how about a more fleshed out suggestion:

 1) When the device is attached to the IOASID via VFIO_ATTACH_IOASID
    it communicates its no-snoop configuration:
     - 0 enable, allow WBINVD
     - 1 automatic disable, block WBINVD if the platform
       IOMMU can police it (what we do today)
     - 2 force disable, do not allow WBINVD ever

    vfio_pci may want to take this from an admin configuration knob
    someplace. It allows the admin to customize if they want.

    If we can figure out a way to autodetect 2 from vfio_pci, all the
    better.

 2) There is some IOMMU_EXECUTE_WBINVD IOCTL that allows userspace
    to access wbinvd so it can make use of the no snoop optimization.

    wbinvd is allowed when:
      - A device is joined with mode #0
      - A device is joined with mode #1 and the IOMMU cannot block
        no-snoop (today)

 3) The IOASIDs don't care about this at all. If IOMMU_EXECUTE_WBINVD
    is blocked and userspace doesn't request to block no-snoop in the
    IOASID then it is a userspace error.

 4) The KVM interface is the very simple enable/disable WBINVD.
    Possessing a FD that can do IOMMU_EXECUTE_WBINVD is required
    to enable WBINVD at KVM.
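
The rule in item 2 can be sketched as a small decision function. This
is only an illustration: the enum and struct names are hypothetical
(the list above only numbers the modes 0, 1 and 2). It also assumes
that one qualifying device is enough to allow wbinvd. How a mode 2
device interacts with other attached devices is left open here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for the three no-snoop modes from item 1. */
enum nosnoop_mode {
    NOSNOOP_ENABLE = 0,        /* allow WBINVD */
    NOSNOOP_AUTO_DISABLE = 1,  /* block WBINVD if the IOMMU can police no-snoop */
    NOSNOOP_FORCE_DISABLE = 2, /* never allow WBINVD */
};

/* Hypothetical per-device state recorded at attach time. */
struct attached_device {
    enum nosnoop_mode mode;
    bool iommu_can_block_nosnoop;
};

/* Does this one device justify wbinvd access under item 2? */
static bool device_allows_wbinvd(const struct attached_device *dev)
{
    switch (dev->mode) {
    case NOSNOOP_ENABLE:
        return true;
    case NOSNOOP_AUTO_DISABLE:
        /* Today's behaviour: allowed only when the IOMMU cannot block it. */
        return !dev->iommu_can_block_nosnoop;
    case NOSNOOP_FORCE_DISABLE:
    default:
        return false;
    }
}

/* Assumption: wbinvd is allowed if any attached device qualifies. */
static bool wbinvd_allowed(const struct attached_device *devs, size_t n)
{
    assert(devs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (device_allows_wbinvd(&devs[i]))
            return true;
    }
    return false;
}
```

With no devices attached, wbinvd_allowed() returns false, which matches
item 4: without a FD that can do IOMMU_EXECUTE_WBINVD there is nothing
to justify enabling WBINVD at KVM.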

It is pretty simple from a /dev/ioasid perspective, covers today's
compat requirement, gives some future option to allow the no-snoop
optimization, and gives a new option for qemu to totally block wbinvd
no matter what.

Jason


