Re: [PATCH v4 2/3] network: Introduce network hooks

Michal Privoznik <mprivozn@xxxxxxxxxx> · Thu, 13 Feb 2014 14:17:05 +0100

On 13.02.2014 12:10, Laine Stump wrote:

On 02/12/2014 08:20 PM, Laine Stump wrote:
It all looks fine (aside from the few small grammar corrections in the
docs. I just want one last confirmation that we don't want both "plug"
and "plugged" events, and in that case ACK.

A few marginally related things came to my mind since I sent that reply,
leading me to rethink my conditional ACK:

1) ====================================================

I looked back through the review comments of previous versions, and see
that I mis-remembered about anyone objecting to both a "plug" and a
"plugged" hook. I wasn't really thinking in terms of the "plug" hook
being able to refuse plugging in an interface, but more that there my be
some operations that must be performed prior to allocating the
tap/macvtap device, and also that it makes it more consistent with the
other hook groups (domains and networks both have "start" "started"
"stopped", so to me it seems consistent for a network interface to have
"plug" "plugged" "unplugged"). Daniel - care to reel in my
over-ambitious spirit here? :-)

Okay, we can ignore the hook return value in (un-)plug case. I don't 
really care.

2) ====================================================

Beyond that, I was thinking about this last night as I fell asleep and
realized that in these plug hooks, there is no simple way to determine
which interface of the domain is being plugged/unplugged - we are
sending the full domain XML and full network XML, but if the domain has
multiple interfaces we can't easily figure out which is the one we're
plugging (and as a matter of fact, due to the way
qemuDomainAttachNetDevice() is organized, the new device being plugged
in will not yet be in the domain XML *at all* at the time the "plugged"
hook is called![*])

So I think we need to add the specific <interface> to the XML sent to
the hook, i.e.:

      <hookData>
        <interface>
          ...
        </interface>
        <network>
          ...
        </network>
        <domain>
          ...
        </domain>
     <hookData>

(Putting the lone <interface> first will make it simpler for luddites
like me to just grep for the first occurence of "<mac address" on stdin
rather than firing up a full-fledged XML parser.)

Well, the hook script is now fed with XML - we can do whatever we want - 
even after this is pushed :)

([*] I'm not certain yet if we should also be modifying
qemuDomainAttachNetDevice() to add the new <interface> to the domain
prior to calling networkAllocateActualDevice(), and then remove it
afterwards if there is a failure, or if it's okay to post-add it after
everything is successful.)

3) ====================================================

ANNNNDDDD.... I've also just realized that it may finally be time to
make public what has up to now been something only stored in the
domain's *status* kept on disk for internal use (and never returned in
virDomainGetXMLDesc(), thus not in the public API or documentation): the
"<actual>" element of an interface, which contains information such as
what type of interface it *really* is, which physdev a macvtap interface
is connected to, and what bandwidth was actually selected based on
merging the <interface> config + any <portgroup> that the interface may
belong to. Alternately, perhaps when we format status for public
consumption, the <actual> element should just be merged up to its parent
<interface>. As an example of what I'm talking about, let's say that you
have the following network:

[A]
   <network>
     <name>testnet</network>
     <forward mode='bridge' dev='eth0'>
       <interface dev='eth0'/>
       <interface dev='eth1'/>
       <interface dev='eth2'/>
     </forward>
     <portgroup name='Bob'>
       <bandwidth>
         <inbound average='1000' peak='5000' floor='200' burst='1024'/>
         <outbound average='128' peak='256' burst='256'/>
       </bandwidth>
     </portgroup>
   </network>

and then you have the following interface in a domain:

[B]
   <interface type='network'>
     <source network='testnet' portgroup='Bob'/>
     <model type='virtio'/>
     <mac address='52:54:00:11:22:33'/>
   </interface>

Once you've allocated the actual device, this is what's stored in the
status file:

[C]
   <interface type='network'>
     <source network='testnet' portgroup='Bob'/>
     <model type='virtio'/>
     <mac address='52:54:00:11:22:33'/>
     <actual type='direct'>
       <source dev='eth1' mode='bridge'/>
       <bandwidth>
         <inbound average='1000' peak='5000' floor='200' burst='1024'/>
         <outbound average='128' peak='256' burst='256'/>
       </bandwidth>
     </actual>
   </interface>

(note that the <actual> element is, as previously stated, *never*
returned from any public API, it is only stored internally so that we
can correctly reconstruct state when libvirtd is restarted)

Now, you may ask why the actual data is stored in a subelement rather
than just storing it merged into <interface> directly, and that would be
a good question! The reason is that if we did that, we would lose
information. Here's what happens when you merge:

[D]
   <interface type='direct'>
     <source dev='eth1' mode='bridge'/>
     <bandwidth>
       <inbound average='1000' peak='5000' floor='200' burst='1024'/>
       <outbound average='128' peak='256' burst='256'/>
     </bandwidth>
     <model type='virtio'/>
     <mac address='52:54:00:11:22:33'/>
   </network>

If we now restarted libvirtd, we would be missing the fact that this
interface was allocated from a network, and that it thus must be
returned to the network when the guest terminates. You might think that
would be okay, since the persistent config contains that information,
BUT if it's a transient domain, then there is no persistent config.

HOWEVER (I know, this is getting much too long!), I didn't want to
publish <actual> in the API, since once that is done, it's done forever,
and I've always hoped to figure out a way around it before someone
really needed the information.

So, where was I? Oh - so now somebody *does* need this "actual"
information, meaning it's time to figure out the correct fix. How about
this (I would do it as a separate patch from your hooks patch series):

1) internally and in the status file, we continue storing the "actual"
data just as we have been.

2) externally (both in the virDomainGetXMLDesc() API and in the network
and domain hooks) we "merge" the <actual> element up to <interface>
level, as I did in [D] above.

This would mean that those looking externally at status (vir
virDomainGetXMLDesc() or a hook script) wouldn't see that an interface
was really allocated from a network or that the bandwidth came from a
portgroup, but it would have the advantage of giving them exact info
about what resources the guest is using (and more importantly, that
information would always be in the same place, regardless of whether the
interface was setup directly with <interface
type='direct|bridge|hostdev'> or indirectly with <interface type='network'>

I suppose we could go beyond that and add some tags to indicate the
origins of the items in question, but I don't know if that's necessary
(and, again, once we put it in, we would have to keep it, even if we
later decided it was a bad idea).

Your approach seems reasonable. But then again, it's XML what we are 
passing to the hook script. <domain/>, <network/> even <hookData/> can 
be extended anytime. We can do that later if we please. Or if somebody 
turns out requesting such behavior.

Frankly, with every round of the patches it's getting more and more 
complicated. I'm surely in for doing the right thing, but we can do the 
necessary now and enhance later if somebody with real use case shows up.

Michal

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list