Re: [PATCH] qemu: match controller index for LIVE+CONFIG when doing hotplug

Martin Kletzander <mkletzan@xxxxxxxxxx> · Fri, 24 Jun 2016 13:25:19 +0200

On Thu, Jun 23, 2016 at 12:34:35PM -0400, Laine Stump wrote:
On 06/23/2016 08:26 AM, Ján Tomko wrote:
On Wed, Jun 22, 2016 at 03:00:47PM -0400, Laine Stump wrote:
An attempt to attach a new scsi controller with both --live and
--config but without specifying an index, e.g.:

 <controller model="virtio-scsi" type="scsi" />

led to this error:

 internal error: Cannot parse controller index -1

This was because unspecified indexes are auto-assigned during
virDomainDefPostParse(), which doesn't happen for hotplugged devices
until after the device has been added to the domainDef, but
qemuDomainAttachFlags() makes a copy of the device (for feeding to
qemuDomainAttachDeviceLive() *before* it's added to the config, and
the copying function actually formats the device object and then
re-parses it into a new object.

Since qemuDomainAttachDeviceConfig() consumes the device object
pointer (i.e. sets it to NULL in the original virDomainDeviceDef) we
can't just wait to make the copy. Instead, we need to make a *shallow*
copy of the virDomainDeviceDef prior to
qemuDomainAttachDeviceConfig(), then make a deep copy of the shallow
copy.

This resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1344899
---
src/qemu/qemu_driver.c | 31 ++++++++++++++++++++-----------
1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index fa05046..f3e17e2 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -8199,6 +8199,7 @@ qemuDomainAttachDeviceFlags(virDomainPtr dom,
    virDomainObjPtr vm = NULL;
    virDomainDefPtr vmdef = NULL;
    virDomainDeviceDefPtr dev = NULL, dev_copy = NULL;
+    virDomainDeviceDef dev_shallow;
    int ret = -1;
    unsigned int parse_flags = VIR_DOMAIN_DEF_PARSE_INACTIVE |
                               VIR_DOMAIN_DEF_PARSE_ABI_UPDATE;
@@ -8237,22 +8238,19 @@ qemuDomainAttachDeviceFlags(virDomainPtr dom,
    if (dev == NULL)
        goto endjob;

-    if (flags & VIR_DOMAIN_AFFECT_CONFIG &&
-        flags & VIR_DOMAIN_AFFECT_LIVE) {
-        /* If we are affecting both CONFIG and LIVE
-         * create a deep copy of device as adding
-         * to CONFIG takes one instance.
-         */
-        dev_copy = virDomainDeviceDefCopy(dev, vm->def, caps,
driver->xmlopt);
-        if (!dev_copy)
-            goto endjob;
-    }
-
    if (priv->qemuCaps)
        qemuCaps = virObjectRef(priv->qemuCaps);
    else if (!(qemuCaps =
virQEMUCapsCacheLookup(driver->qemuCapsCache, vm->def->emulator)))
        goto endjob;

+    /* Save away the pointer to the device object before it is
+     * potentially swallowed up by qemuDomainAttachDeviceConfig().
+     * This will allow us to make a copy of the device after any
+     * modifications made by virDomainDefPostParse() (which is called
+     * after the new device is added to the config
+     */
+    dev_shallow = *dev;
+

This looks fragile and complicated, and I've already managed to break it
with the XML you provided:

$ virsh attach-device f24 cont --live
Device attached successfully

$ virsh attach-device f24 cont --live --config
error: Failed to attach device from cont
error: operation failed: target scsi:1 already exists

Yeah, I noticed that too just before I posted, and am still thinking
about how to solve it, but I still think the situation is one step
better with this patch in place (although it does have a similar effect
on --live + --config attachment of PCI devices - changing from one
improper behavior to another. Sigh.)

This all comes down to a more general problem that I offhandledly
mentioned in an email with Tomasz Flendrich (the GSoC student working on
"device address abstraction") - other than the live state starting off
as a copy of the config, we make no attempt to maintain consistency of
device attributes between the two, and could easily end up with
conflicting things between the two. Normally this is mostly transparent
to the user, but could cause serious complaints from guest OSes when
they are later restarted and all their nice cozy devices previously
added with "--live --config" are suddenly moved (this may not make sense
now, but there are some further examples below).


Even if the OS doesn't complain, we say we're trying to guarantee that
guest ABI will not change.  Re-plugging some devices causes that (not
USB for example) and we should avoid it.


AFAIK the reason we create a deep copy instead of parsing it again is

But a "deep copy" is just formatting the object and then parsing the
resulting XML again.

our generation of MAC addresses in the parser:
commit 1e0534a770208be6b848c961785db20467deb2fc
   qemu: Don't parse device twice in attach/detach

Well, that's a bit misleading. It *is* parsing twice; it's just that
after the patch, the 2nd parse is done on the result of formatting the
result of the first parse rather than parsing the original xml again.
Because the MAC address is auto-generated as a part of the parsing
(rather than during post-parse), we're able to get a MAC address
consistent between live and persistent without calling
virDomainDefPostParse(). Anything set in the post-parse doesn't have
this same happy existence though.

(side-note - when I made the patch to allow auto-generating the index, I
originally put it directly in the parser rather than post-parse, but
this was (rightfully) NACKed by Cole:

  https://www.redhat.com/archives/libvir-list/2016-May/msg01065.html

Doing it the original way would have avoided the "already exists" error,
but not the more general problem of diverging/conflicting live state and
persistent config.)

So the question is: how should we treat the combination of --live and
--config?

Not just that, but what should we do with --live by itself? Should we
allow someone to hotplug a device --live with attributes that would have
failed if given to --config (i.e. if they specify a PCI address that is
currently unused in the live domain, but is in use in the config)? I
could see a valid use case for this - if the device was added with
--live and the user later wanted to make that device permanent - but
allowing it could also lead to undesired/unexpected device address changes.


it should be possible both ways: --live with one device and --config with
other on the same address, and the other way around as well.  Checking
anything more would make the code way more complex just to tell the user
it's *maybe* not what they wanted.  Don't forget you can boot a totally
different XML than the persistent one with CreateXML.

And what about the current bad behavior when you do this?

  virsh attach interface f24 --type network --source default --live
  virsh attach interface f24 --type network --source default --live
--config


This can be separated into two different issues.  If you do
attach-interface, we generate an XML without address, so you should be
able to do the above and have 2 more interfaces live, the second one
would be identical to the only one added to config.  However if you do
attach-device on a domain with XML that has the address specified, first
with only one of --live/--config and then with both, then the second
call should fail.

Without my patch, the 2nd device ends up with a different PCI address in
the live and persistent object. With my patch, you end up with a similar
error to above (complaining about attempting to use a PCI address that's
already in use). So which evil is worse? Refusing to give a device
different info for live vs. config (arguably not *as bad* for a PCI
address vs. a MAC address or controller index, but still very
undesirable)? Or silently allowing the evil, and damn the consequences?
The latter would just cause some extra bookkeeping in the guest for a
PCI address (the next time you start it, the guest would think the
original device had disappeared and been replaced with another at a
different address, with varying levels of whinging), but if you change
the index of a SCSI controller, you could end up with disks plugged into
the wrong controller (hmm, I guess that's not a horrible thing either,
unless the device name in the guest is based on the controller's
position in the devices, and that device name is used in the OS config
somewhere... Yuck - better to make sure it doesn't change.)


Try to make the persistent device as close as the live one?
Or try to make it match the persistent config (while the live one would
match the live config?)

I think we should make the two identical. For that to be true 100% of
the time, we need to consider both the live and persistent config  any
time we're adding a device to either (or maybe just when we're adding it
to both? I need to think or a minute, and I still haven't made my coffee).


Definitely totally identical or just rollback and return error.

(BTW, we also should consider what happens when you add a scsi disk with
--live --config, attaching it to a controller that was added with just
--live. I guess the controller should be automatically added to the
config, but "dunno, haven't tried".)


I think that controller will be added automatically even if it is not in
any of the definitions.  However the fact that user/mgmt app treated
live and config definitions differently gives us the ability to just
simply handle them differently as well.  We should not try to make
automatic additions way too much automatic IMHO.


Either way, it would be nicer to get the device definition stable
before we even try to add it to the persistent definition.

The problem is that all of the stuff that "stabilizes" the device
definition is done in the post-parse callback, and that requires that
the device already be inserted into the domain definition.


Add the definition in the config, do postparse, check if that definition
can be added into live, repeat with better data.  Rolling back the
addition to config is easy.  I haven't thought it through yet, though.


Also, calling virDomainDefPostParse after device coldplug is strange,
we should be adding a device that does not need ajdustments.

Again, we don't have the information necessary to adjust it until we at
least have the domain def available (and all the post-parse callback
infrastructure assumes/requires that the device already be inserted into
the domain). But if we don't do the *entire* virDomainDefPostParse()
then we are in danger of missing something during a hotplug.

Maybe we need to rethink how we can do the post-parse stuff so that:

1) both the persistent config and (optionally) live state are supplied
to the device callbacks (with assurances that the live state is ignored
if it's NULL)

2) address and index assignment aren't done with a single call in the
domain callback that assigns everything, but instead are done one-by-one
in the device post-parse callback, assuring that newly assigned
addresses aren't in use in either the persistent config or (if present)
the live state. (I think the best way to do this may actually be to
resurrect the idea of the pciaddrs "cache" rather than killing it (as is
currently being discussed)).


I had some ideas in my mind similar to this; it doesn't really matter if
the allocation is extracted from PostParse or not and I think lot of
stuff will change during the implementation =)

And now I find myself all out of words, but without a solution or
sufficient motivation to look for one. Hopefully something will come up
out of the conversation.


Friday, only small amount of coffee, well, it's not the right time to
think this through for me, but we definitely need to take a broader
approach to all the address assignments.

Martin
Attachment:
signature.asc

Description: Digital signature
--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list