On Fri, 2017-08-25 at 12:12 +0200, Peter Krempa wrote: > > > @@ -1012,6 +1012,18 @@ qemuDomainFillDeviceIsolationGroup(virDomainDefPtr def, > > > tmp = virPCIDeviceAddressGetIOMMUGroupNum(hostAddr); > > > > > > if (tmp < 0) { > > > + /* If there's already a PCI address assigned to the device > > > + * we move on instead of erroring out. > > > + * > > > + * This means we still don't allow non-existing host > > > + * devices to be assigned to guests; however, if the host > > > + * device existed at some point in the past but no longer > > > + * does, which can happen very easily when dealing with VFs, > > > + * we prevent the guest from disappearing and give the user > > > + * an opportunity to edit its configuration */ > > > + if (virDeviceInfoPCIAddressPresent(info)) > > > + goto skip; > > > + > > > virReportError(VIR_ERR_INTERNAL_ERROR, > > > _("Can't look up isolation group for host device " > > > "%04x:%02x:%02x.%x"), > > Why on earth is this in the domain address callback? That means that > it's filled on parse. That's silly for a thing like the isolation group. > > The isolation group is necessary only when you are going to start the VM > so only then it should be determined. That would prevent a bug like this > altogether. > > Also there's PCI hotplug... Well, the whole point of introducing isolation groups was so that isolation constraint would be respected when assigning PCI addresses, and guest startup is way too late for that. If host devices are not isolated correctly on pSeries guests, depending on the situation the result can be either degraded error handling (undesirable, but maybe not too bad) or the guest failing to start (pretty bad). So picking the right PCI addresses is critical, and we need to know the isolation group in order to do it. Note that these constraints only apply to pSeries, and accordingly the code above only runs for such guests. That said, I hadn't realized that configuring guests to use devices that are not yet or no longer available on the host was supposed to work, but since that's the case the current behavior is basically a regression because it prevents the users from doing something that they would have been able to do in the past. I'll post a v2 that changes the behavior so that we fall back to the old ways (no isolation) for devices that are not available on the host at the time the guest is configured. Thanks for pointing out my mistake :) -- Andrea Bolognani / Red Hat / Virtualization -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list