HI Jan, The spapr-vfio-pci-host-bridge controller is being reworked in qemu-ppc to handle all the iommu groups generically(i.e iommu group need not be mentioned in the CLI). https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-February/124922.html This would help avoiding the sysfs scanning and allows further simplification of libvirt patches. I will post my patches post testing once the qemu changes are merged. Thanks, Shiva On Tue, Dec 9, 2014 at 3:21 PM, Shivaprasad bhat <shivaprasadbhat@xxxxxxxxx> wrote: > On Mon, Dec 8, 2014 at 9:01 PM, Ján Tomko <jtomko@xxxxxxxxxx> wrote: >> This has a qemu: prefix, even though the parsing is done in conf: >> >> It would look nicer split into several commits: >> 1) parsing the new controller type (without auto-adding them) + qemuxml2xml >> test of the controllers >> 2) auto-adding the controllers >> 3) adding the 0 domain parameter to many functions (if needed) >> > > I added the prefix to mention this is for qemu only. With the > splitting that you are > suggesting, I think I can be more specific. > > >> On 11/17/2014 11:04 AM, Shivaprasad G Bhat wrote: >>> This patch will get the iommu group for host devices by XML configuration >>> of the vfio host bridge controller or from the sysfs. >>> >>> On pseries, the iommu group can be shared by multiple host devices and >>> they would share the same spapr-vfio-host-bus controller. A new >>> controller is added for every new iommu group. Every spapr-vfio-host-bridge >>> in the cli creates a new pci domain in the guest. For Example, >>> -device spapr-pci-vfio-host-bridge,iommu=1,id=SOMEDOMAIN,index=1 >>> The "SOMEDOMAIN" is the id for new pci domain inside guest. >>> >>> spapr-pci-vfio-host-bridge is actually a PCI host bridge with VFIO support. >>> It hosts pci-bridges, and has all the features/behaviours similar to the >>> default emulated pci host bridge. The controller can host both pci and pcie >>> devices. For convenience, this root controller uses model "pci-root". >>> >>> The sample controller tags would look like below: >>> <controller type='spapr-vfio-pci' index='0' model='pci-root' >>> iommuGroupNum='3' domain='1'/> >>> <controller type='spapr-vfio-pci' index='1' model='pci-bridge' >>> iommuGroupNum='3' domain='1'> >> >> Would it make sense to fill out the iommuGroup on domain startup, depending on >> the devices that were assigned there? >> >> So that libvirt user can do: >> <controller type='spapr-vfio-pci' domain='1'/> >> <controller type='spapr-vfio-pci' domain='2'/> >> >> Then attach two hostdevs to these domains without knowing their IOMMU group? >> >> (Although the user needs to know which devices are in the same group to attach >> more than one to the same domain) > > I have added the iommu group to the controller to have 1:1 mapping of > domain to iommu. This helps in validating the Addresses to say if a > device is assigned to the correct domain address. (patch 3 does this). > > As you suggested below in one of the comments, If the iommu group detection be > delayed till the process start, I think we can do away with iommu group > in the controller xml. I'll see how I can avoid this in v3. > >> >>> <address type='pci' domain='0x0001' bus='0x00' slot='0x02' >>> function='0x0'/> >>> </controller> >>> <controller type='spapr-vfio-pci' index='0' model='pci-root' >>> iommuGroupNum='13' domain='2'/> >>> >>> Like other architectures, unassigned and unmanaged devices in the iommu group >>> need to be detached manually before the guest is created . >>> >>> The spapr-vfio-pci controllers are removed when there are no corresponding >>> hostdevices in xml. >>> >>> Signed-off-by: Shivaprasad G Bhat <sbhat@xxxxxxxxxxxxxxxxxx> >>> Signed-off-by: Pradipta Kumar Banerjee <bpradip@xxxxxxxxxx> >>> Reviewed-by: Prerna Saxena <prerna@xxxxxxxxxxxxxxxxxx> >>> --- >>> docs/schemas/domaincommon.rng | 28 +++++++ >>> src/bhyve/bhyve_domain.c | 2 >>> src/conf/domain_conf.c | 165 +++++++++++++++++++++++++++++++++++++++-- >>> src/conf/domain_conf.h | 19 +++++ >>> src/libvirt_private.syms | 1 >>> src/qemu/qemu_command.c | 4 + >>> src/qemu/qemu_domain.c | 12 +-- >>> src/qemu/qemu_driver.c | 6 + >>> 8 files changed, 222 insertions(+), 15 deletions(-) >>> >> >> >>> @@ -12113,6 +12165,95 @@ virDomainResourceDefParse(xmlNodePtr node, >>> return NULL; >>> } >>> >>> +int >>> +virDomainDefMaybeAddHostdevSpaprPCIVfioControllers(virDomainDefPtr def) >>> +{ >>> + size_t i, j; >>> + virDomainHostdevDefPtr hostdev; >>> + virDomainControllerDefPtr controller; >>> + int ret = -1; >>> + int maxDomainId = 0; >>> + int skip; >>> + >>> + if (!(ARCH_IS_PPC64(def->os.arch)) || >>> + !(def->os.machine && STRPREFIX(def->os.machine, "pseries"))) >>> + return 0; >>> + >>> + for (i = 0; i < def->nhostdevs; i++) { >>> + hostdev = def->hostdevs[i]; >>> + if (IS_PCI_VFIO_HOSTDEV(hostdev)) >>> + hostdev->source.subsys.u.pci.iommu = -1; >>> + } >>> + /* The hostdevs belonging to same iommu are >>> + * all part of same domain. >>> + */ >>> + for (i = 0; i < def->ncontrollers; i++) { >>> + controller = def->controllers[i]; >>> + if (controller->type == VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO && >>> + controller->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT) >>> + for (j = 0; j < def->nhostdevs; j++) { >>> + hostdev = def->hostdevs[j]; >>> + if (IS_PCI_VFIO_HOSTDEV(hostdev)) >>> + if (hostdev->info->addr.pci.domain == controller->domain) >>> + hostdev->source.subsys.u.pci.iommu = controller->opts.spaprvfio.iommuGroupNum; >>> + } >>> + if (controller->domain > maxDomainId) >>> + maxDomainId = controller->domain; >>> + } >>> + /* If the spapr-vfio controller doesnt exist for the hostdev >>> + * add a controller for that iommu group. >>> + */ >>> + for (i = 0; i < def->nhostdevs; i++) { >>> + skip = 0; >>> + hostdev = def->hostdevs[i]; >>> + if (IS_PCI_VFIO_HOSTDEV(hostdev)) { >>> + virPCIDeviceAddressPtr addr; >>> + int iommu = -1; >>> + if (hostdev->source.subsys.u.pci.iommu == -1) { >>> + addr = (virPCIDeviceAddressPtr)&hostdev->source.subsys.u.pci.addr; >>> + if ((iommu = virPCIDeviceAddressGetIOMMUGroupNum(addr)) < 0) >>> + goto error; >> >> I don't know if we should access /sysfs on XML parsing. Could we add the >> controllers at startup instead? > > Are you saying we auto-add the controller around, qemuProcessStart OR > qemuPrepareHostDevices() > function calls? The normal domxml-to-native would still need these > controllers right ? > >> >>> + hostdev->source.subsys.u.pci.iommu = iommu; >>> + >>> + for (j = 0; j < def->ncontrollers; j++) { >>> + controller = def->controllers[j]; >>> + if (controller->type == VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO && >>> + controller->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT) { >>> + if (iommu == controller->opts.spaprvfio.iommuGroupNum) >>> + skip = 1; >>> + } >>> + } >>> + if (skip) >>> + continue; >>> + if (virDomainDefMaybeAddController(def, VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO, >>> + ++maxDomainId, 0, VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT) < 0) >>> + goto error; >>> + def->controllers[def->ncontrollers-1]->opts.spaprvfio.iommuGroupNum = iommu; >>> + } >>> + } >>> + } >>> + >>> + /* Remove redundant controllers which dont serve any hostdevs. */ >> >> We don't delete unused controllers of other types, what if the user explicitly >> requests them? > > If there are stale controllers, the iommu group which they serve would > get busy, as the > guest would start using them even though there are no hostdevs from > that group. Since > we auto-add them, I felt its better to remove them if unnecessary. > >> >> Jan >> >>> + for (i = 0; i < def->ncontrollers; i++) { >>> + controller = def->controllers[i]; >>> + if (controller->type == VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO) { >>> + int usedController = 0; >>> + for (j = 0; j < def->nhostdevs; j++) { >>> + hostdev = def->hostdevs[j]; >>> + if (IS_PCI_VFIO_HOSTDEV(hostdev)) >>> + if (hostdev->source.subsys.u.pci.iommu == controller->opts.spaprvfio.iommuGroupNum) >>> + usedController = 1; >>> + } >> >> -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list