On 10/07/13 16:10, Laine Stump wrote: > On 10/04/2013 08:55 AM, Peter Krempa wrote: >> Add code to check availability of PCI passhthrough using VFIO and the >> legacy KVM passthrough and use it when starting VMs and hotplugging >> devices to live machine. >> --- >> src/qemu/qemu_hostdev.c | 126 ++++++++++++++++++++++++++++++++++++++++++++++++ >> src/qemu/qemu_hostdev.h | 4 ++ >> src/qemu/qemu_hotplug.c | 4 ++ >> src/qemu/qemu_process.c | 5 ++ >> 4 files changed, 139 insertions(+) >> >> diff --git a/src/qemu/qemu_hostdev.c b/src/qemu/qemu_hostdev.c >> index 21fe47f..dbbc2b4 100644 >> --- a/src/qemu/qemu_hostdev.c >> +++ b/src/qemu/qemu_hostdev.c ... >> @@ -1287,3 +1293,123 @@ void qemuDomainReAttachHostDevices(virQEMUDriverPtr driver, >> qemuDomainReAttachHostScsiDevices(driver, def->name, def->hostdevs, >> def->nhostdevs); >> } >> + >> + >> +static bool >> +qemuHostdevHostSupportsPassthroughVFIO(void) >> +{ >> + DIR *iommuDir = NULL; >> + struct dirent *iommuGroup = NULL; >> + bool ret = false; >> + >> + /* condition 1 - /sys/kernel/iommu_groups/ contains entries */ >> + if (!(iommuDir = opendir("/sys/kernel/iommu_groups/"))) >> + goto cleanup; >> + >> + while ((iommuGroup = readdir(iommuDir))) { >> + /* skip ./ ../ */ >> + if (STRPREFIX(iommuGroup->d_name, ".")) >> + continue; >> + >> + /* assume we found a group */ >> + break; >> + } >> + >> + if (!iommuGroup) >> + goto cleanup; >> + /* okay, iommu is on and recognizes groups */ >> + >> + /* condition 2 - /dev/vfio/vfio exists */ >> + if (!virFileExists("/dev/vfio/vfio")) >> + goto cleanup; > > > Was this all that was suggested? At any rate it's a good start even if > there are other checks that can be added later. Yes that was all. While I was testing this I found out, that this doesn't necesarily mean everything will go well. For example on crappy hardware you might need to enable unsafe interrupt routing (which may kill your host in most cases). These are then caught by the improved error reporting code in the qemu driver so you get a more useful error message. (saying that operation was not permitted and few entries in the kernel log). > > >> + ... >> +bool >> +qemuHostdevHostVerifySupport(virDomainHostdevDefPtr *hostdevs, >> + size_t nhostdevs) >> +{ >> + int supportsPassthroughKVM = -1; >> + int supportsPassthroughVFIO = -1; >> + size_t i; >> + >> + /* assign defaults for hostdev passthrough */ >> + for (i = 0; i < nhostdevs; i++) { >> + virDomainHostdevDefPtr hostdev = hostdevs[i]; >> + >> + if (hostdev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && >> + hostdev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI) { >> + int *backend = &hostdev->source.subsys.u.pci.backend; >> + >> + /* cache host state of passthrough support */ >> + if (supportsPassthroughKVM == -1 || supportsPassthroughVFIO == -1) { >> + supportsPassthroughKVM = qemuHostdevHostSupportsPassthroughLegacy(); >> + supportsPassthroughVFIO = qemuHostdevHostSupportsPassthroughVFIO(); > > > These should be checked once at the top of the function and their return > values cached. Hmm; I originally didn't want to call the code in case no PCI passthrough device was present in the guest. The main reason was that the first version was erroring out in the functions directly. I think we can safely call those every time we see non-zero count of hostdevs. I will adapt the code in the next version. > > >> + } >> + >> + switch ((virDomainHostdevSubsysPciBackendType) *backend) { >> + case VIR_DOMAIN_HOSTDEV_PCI_BACKEND_VFIO: >> + if (!supportsPassthroughVFIO) { >> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", >> + _("host doesn't support VFIO PCI passthrough")); >> + return false; >> + } >> + break; >> + ... >> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c >> index 7a30a5e..4aa9582 100644 >> --- a/src/qemu/qemu_process.c >> +++ b/src/qemu/qemu_process.c >> @@ -3567,6 +3567,11 @@ int qemuProcessStart(virConnectPtr conn, >> goto cleanup; >> } >> >> + /* check and assign device assignment settings */ >> + if (!qemuHostdevHostVerifySupport(vm->def->hostdevs, >> + vm->def->nhostdevs)) >> + goto cleanup; >> + > > In one case you're calling this very near to a call to > qemuPrepareHostdevPCIDevices(), in the other very near to a call to > qemuPrepareHostDevices, which itself calls > qemuPrepareHostdevPCIDevices(); I think this check should be made inside > qemuPrepareHostdevPCIDevices() so that the higher level code isn't > polluted with the detail. Seems like that will be an okay place too. At first I was looking for a place that won't be called when running the tests, but qemuPrepareHostdevPCIDevices() seems like the right place to do it. I will adapt the code and re-send. Peter
Attachment:
signature.asc
Description: OpenPGP digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list