v3:

Quite a bit of restructuring. Functional differences include exposing
another fw_cfg file to indicate the size of the stolen memory region.
We really have no need to copy anything into stolen memory, so while we
tell SeaBIOS about it via a fw_cfg file, the data pointer is NULL so it
can't be read. I'm also now reading the GGMS size from the GMCH
register, which determines the size of the GTT region of stolen memory.
The vBIOS is typically only using 1MB, but this is often 2MB in
hardware. I also give the user the ability to specify a GMS value for
further stolen memory. We default it to zero and it's an experimental
option, so we can remove it if it's not useful.

QEMU now does the virtualization of the GMCH and BDSM registers, which
it was sort of doing before anyway, but the vfio kernel code no longer
does anything special for them. Getting the GGMS size requires that we
know something about the IGD version we're using, so code is added for
that.

One fun thing: IGD is really part of the reason that the x-vga option
is experimental. IGD doesn't like to give up VGA routing. Now we can
use that to our advantage. If the hardware doesn't report VGA disabled,
we can automatically turn it on. I position this around all the other
stuff of doing vBIOS and BDSM quirks, so it can be disabled by
specifying rombar=0 or using the new x-no-auto-vga option. I'm also
using this to signal when to skip creating the ISA bridge and messing
with the host bridge. If we have a Gen8 or newer and rombar=0 is
specified then we don't do any special setup, which should enable
Intel's Universal Passthrough Mode. This is already supported by
libvirt, so it should make an easy path between old and new modes.

Not seen here is a whole revision that created fake BARs on the ISA
bridge for the opregion and stolen memory such that they were
automatically mapped with no BIOS requirement. That has a gap in that
stolen memory gets disabled during BAR sizing, and it breaks altogether
if the guest moves the BAR.
It really only affects VESA mode, but it's still enough to abandon
that hack approach.

This will work with the previous kernel patches, but I'd recommend v3
anyway, plus the PCI FLR reset delay on laptops. You will definitely
need new SeaBIOS for this or else you'll get a hw_error.

Happy testing, reviews and feedback welcome. Thanks,

Alex

v2:

IGD support is greatly expanded. Due to feedback on the previous
series, QEMU no longer maps the OpRegion to the guest; we simply fill a
buffer and expose it via fw_cfg. We could still do the mapping in the
future if there's value to it.

New features include the use of host and LPC bridge config space
provided through new vfio device specific regions. This eliminates the
need for QEMU to go poking around in pci-sysfs. Additionally, the host
and LPC changes are now initiated by vfio-pci upon finding the
necessary regions to support these. Thus the igd_passthru=on machine
option is not needed for this series. This series no longer has any
dependency on Gerd's previous IGD series.

Also included are PCI option ROM fixups, which automatically fix the
device ID in the ROM and recalculate the checksum for ROMs loaded
through vfio. This is necessary for IGD as the ROM vfio provides us
through the shadow ROM space typically has the wrong ID and a bogus
checksum. It would also be useful for anyone "soft modding" a card by
specifying a different device ID and manually hacking the ROM.

Finally, there is a quirk to handle stolen memory, which requires
cooperation with SeaBIOS. We need the vBIOS, as enabled by the ROM
support above, for lighting up laptop panels (at least for my SNB
system), but that vBIOS tries to make use of host stolen memory, which
either overlaps VM RAM or empty space, which leads to VM memory
corruption or DMAR faults respectively. We can prevent this by
intercepting the vBIOS programming of the device to instead use a
buffer allocated by SeaBIOS. I'm amazed this works, but it does... at
least for me.
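
For anyone not familiar with the option ROM layout, the device ID and
checksum fixup described above amounts to roughly the following. This
is a minimal standalone sketch, not the code from the patch:
rom_fixup(), rd16() and wr16() are hypothetical names, and folding the
correction into the last byte of the image is an assumption about where
the checksum can safely be absorbed. The header offsets (0xaa55
signature, PCIR pointer at 0x18, device ID at PCIR+6, image length in
512-byte units at PCIR+0x10) follow the standard PCI expansion ROM
format.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read/write a little-endian 16-bit value in the ROM image. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static void wr16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

/*
 * Hypothetical helper: patch the device ID in the first image of a PCI
 * option ROM and make the image bytes sum to zero again.
 * Returns 0 on success, -1 if the buffer is not a usable ROM image.
 */
static int rom_fixup(uint8_t *rom, size_t len, uint16_t device_id)
{
    size_t pcir, img_len, i;
    uint8_t sum = 0;

    /* ROM header: 0x55 0xaa signature, PCIR pointer at offset 0x18. */
    if (len < 0x1a || rd16(rom) != 0xaa55) {
        return -1;
    }
    pcir = rd16(rom + 0x18);
    if (pcir + 0x12 > len || memcmp(rom + pcir, "PCIR", 4) != 0) {
        return -1;
    }

    /* Image length lives in the PCI data structure, 512-byte units. */
    img_len = (size_t)rd16(rom + pcir + 0x10) * 512;
    if (img_len == 0 || img_len > len) {
        return -1;
    }

    /* Device ID is at offset 6 of the PCI data structure. */
    wr16(rom + pcir + 6, device_id);

    /* All bytes of the image must sum to zero modulo 256. */
    for (i = 0; i < img_len; i++) {
        sum = (uint8_t)(sum + rom[i]);
    }
    /* Assumption: the last byte of the image can absorb the fix. */
    rom[img_len - 1] = (uint8_t)(rom[img_len - 1] - sum);

    return 0;
}
```

A caller would pass the buffer read from the vfio ROM region along with
the device ID the guest is going to see, and skip the fixup when the
function returns -1.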
Comments and testing feedback welcome. You'll need this QEMU patch
series, the latest vfio patch series (including the PCI reset path on
laptops), and a new SeaBIOS patch series. Thanks,

Alex

v1:

This is the QEMU complement to the vfio kernel capability chain series.
This is RFC since it depends on those non-upstream kernel changes.
Patch 1/ will be posted separately; it's somewhat unrelated, but it is
in my build tree so I include it here for anyone that wants to build
this series.

This series includes sparse mmap support for avoiding mmaps over the
MSI-X vector table and device specific memory regions for IGD OpRegion
support. MemoryRegions are significantly generalized for the former, to
make it really easy for each vfio region to be backed by zero or more
mmap MemoryRegions. The MSI-X vector table then either adds an mmap
region, or not, via a legacy quirk or explicit sparse mmap support.

IGD OpRegions are exposed as a new device specific region, which simply
entails searching regions past those known for matching type and
sub-type regions that we know how to handle. Writes to the OpRegion
register (ASL storage) pop the host OpRegion into VM system memory.
This isn't exactly how real hardware works, but it makes for a
convenient implementation. Alternatively we could pass the entire
OpRegion table via fw_cfg, but this makes write through to the host
impossible (if that's even useful). This is certainly something that
I'm looking for comments about in this series.
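
To illustrate the "zero or more mmaps per region" idea for the MSI-X
case, here is a small standalone sketch of how a region can be split
into mmap-able areas around a page-aligned hole such as the vector
table. The names (struct mmap_area, sparse_areas(), page_down(),
page_up()) are made up for this example and are not the functions or
structures used in the series; the page size is assumed to be a power
of two.

```c
#include <stdint.h>

/* Hypothetical description of one mmap-able chunk of a region. */
struct mmap_area {
    uint64_t offset;
    uint64_t size;
};

/* Round to a page boundary; page must be a power of two. */
static uint64_t page_down(uint64_t v, uint64_t page)
{
    return v & ~(page - 1);
}

static uint64_t page_up(uint64_t v, uint64_t page)
{
    return (v + page - 1) & ~(page - 1);
}

/*
 * Hypothetical helper: split a region of region_size bytes into the
 * areas that do not share a page with the hole
 * [hole_off, hole_off + hole_size).  Returns the number of areas
 * written to out[], which may be 0, 1 or 2.
 */
static int sparse_areas(uint64_t region_size, uint64_t hole_off,
                        uint64_t hole_size, uint64_t page,
                        struct mmap_area out[2])
{
    uint64_t start = page_down(hole_off, page);
    uint64_t end = page_up(hole_off + hole_size, page);
    int n = 0;

    /* Clamp the hole to the region. */
    if (start > region_size) {
        start = region_size;
    }
    if (end > region_size) {
        end = region_size;
    }

    /* Area below the hole, if any. */
    if (start > 0) {
        out[n].offset = 0;
        out[n].size = start;
        n++;
    }
    /* Area above the hole, if any. */
    if (end < region_size) {
        out[n].offset = end;
        out[n].size = region_size - end;
        n++;
    }

    return n;
}
```

With a 16KB BAR, 4KB pages and a vector table at offset 0x1000, this
yields two areas, one on each side of the table's page. A table that
covers the whole BAR yields none, in which case the region is only
accessible through read/write.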

Thanks,
Alex

---

Alex Williamson (9):
      vfio: Add sysfsdev property for pci & platform
      vfio: Wrap VFIO_DEVICE_GET_REGION_INFO
      vfio: Generalize region support
      vfio/pci: Convert all MemoryRegion to dynamic alloc and consistent functions
      linux-headers/vfio: Update for proposed capabilities list
      vfio: Enable sparse mmap capability
      vfio/pci: Fixup PCI option ROMs
      vfio/pci: Split out VGA setup
      Intel IGD support for vfio

 hw/arm/sysbus-fdt.c           |    2 
 hw/vfio/common.c              |  249 +++++++++++++++--
 hw/vfio/pci-quirks.c          |  538 +++++++++++++++++++++++++++++++++++--
 hw/vfio/pci.c                 |  593 +++++++++++++++++++++++------------------
 hw/vfio/pci.h                 |   21 +
 hw/vfio/platform.c            |  126 +++------
 include/hw/vfio/vfio-common.h |   29 ++
 linux-headers/linux/vfio.h    |  101 +++++++
 trace-events                  |   19 +
 9 files changed, 1262 insertions(+), 416 deletions(-)