Re: [PATCH] Enable non page boundary BAR device assignment

On Thu, Dec 10, 2009 at 07:16:04AM +0200, Muli Ben-Yehuda wrote:
> On Wed, Dec 09, 2009 at 06:38:54PM +0100, Alexander Graf wrote:
> 
> > While trying to get device passthrough working with an emulex hba,
> > kvm refused to pass it through because it has a BAR of 256 bytes:
> >
> >         Region 0: Memory at d2100000 (64-bit, non-prefetchable) [size=4K]
> >         Region 2: Memory at d2101000 (64-bit, non-prefetchable) [size=256]
> >         Region 4: I/O ports at b100 [size=256]
> >
> > Since the page boundary is an arbitrary optimization to allow 1:1
> > mapping of physical to virtual addresses, we can still take the old
> > MMIO callback route.
> >
> > So let's add a second code path that allows for size & 0xFFF != 0
> > sized regions by looping it through userspace.
> 
> That makes sense in general *but* the 4K-aligned check isn't just an
> optimization, it also has a security implication. Consider the
> theoretical case where a multi-function device has BARs for two
> functions on the same page (within a 4K boundary), and each function
> is assigned to a different guest. With your current patch both guests
> will be able to write to each other's BARs. Another case is where a
> device has a bug and you must not write beyond the BAR or Bad Things
> Happen. With this patch an *unprivileged* guest could exploit that bug
> and make bad things happen.
> 
> This can be fixed if the slow userspace mmio path checks that all MMIO
> accesses by a guest fall within the portion of the page that is
> assigned to it.

This patch seems to implement range checks correctly,
let me know if I am missing something.
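For reference, the kind of check Muli describes can be sketched as
below. This is only an illustration, not the code from the patch: the
struct and function names are hypothetical, and it assumes the slow
path knows the BAR's base and size.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of a sub-page BAR; the names are
 * illustrative and not taken from the patch. */
struct bar_region {
    uint64_t base;  /* start address of the BAR */
    uint64_t size;  /* BAR size in bytes, e.g. 256 */
};

/* Return true only if [addr, addr + len) lies entirely inside the BAR.
 * The comparison uses subtraction so that addr + len cannot wrap. */
static bool bar_access_ok(const struct bar_region *bar,
                          uint64_t addr, uint64_t len)
{
    if (len == 0 || len > bar->size)
        return false;
    if (addr < bar->base)
        return false;
    return addr - bar->base <= bar->size - len;
}
```

An access that starts inside the BAR but runs past its end is rejected
as a whole rather than truncated.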

Note also that we currently link qemu with libpci, which I think
requires admin capabilities to work. In the future, however, we
might extend this to also support getting device fds over a unix
socket from a more privileged process.
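Passing an fd over a unix socket would presumably use SCM_RIGHTS
ancillary data. A minimal sketch of that mechanism follows; the helper
names send_fd/recv_fd are made up for illustration and nothing like
this exists in qemu today as far as this thread is concerned.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Send one open file descriptor over a connected unix socket.
 * Returns 0 on success, -1 on error. (Hypothetical helper.) */
static int send_fd(int sock, int fd)
{
    char dummy = 0;  /* Linux stream sockets need one real data byte */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        struct cmsghdr align;               /* forces correct alignment */
        char buf[CMSG_SPACE(sizeof(int))];
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one file descriptor sent by send_fd().
 * Returns the new descriptor, or -1 on error. (Hypothetical helper.) */
static int recv_fd(int sock)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(int))];
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The receiver gets its own descriptor for the same open file, with
whatever access the sender's descriptor had.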

If or when this is done, we will have to be extra careful when
passing a device file descriptor to an unprivileged qemu process
if any BAR is smaller than a full page: mapping such a BAR would
give qemu access outside that BAR.
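To put a number on it, here is a small sketch of the arithmetic,
assuming 4K pages and that a mapping always covers whole pages. The
function name and the PAGE_SIZE_4K constant are illustrative only.

```c
#include <stdint.h>

#define PAGE_SIZE_4K 4096ULL  /* assumption: 4K pages, as in this thread */

/* Number of bytes outside the BAR that become reachable when the
 * whole pages covering [base, base + size) are mapped.
 * Assumes size > 0 and that base + size does not wrap. */
static uint64_t bytes_exposed_beyond_bar(uint64_t base, uint64_t size)
{
    uint64_t map_start = base & ~(PAGE_SIZE_4K - 1);
    uint64_t map_end = (base + size + PAGE_SIZE_4K - 1)
                       & ~(PAGE_SIZE_4K - 1);

    return (map_end - map_start) - size;
}
```

For Region 2 in the lspci output quoted above (d2101000, size 256)
this gives 3840 bytes beyond the BAR; for Region 0 (d2100000, size 4K)
it gives 0.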

A possible solution to this problem, if/when it arises, would be
to add yet another sysfs file for each resource that allows
read/write but not mmap access and performs range checks in the
kernel.


> Cheers,
> Muli
> -- 
> Muli Ben-Yehuda | muli@xxxxxxxxxx | +972-4-8281080
> Manager, Virtualization and Systems Architecture
> Master Inventor, IBM Research -- Haifa
> Second Workshop on I/O Virtualization (WIOV '10):
> http://sysrun.haifa.il.ibm.com/hrl/wiov2010/