On 6/19/20 12:00 PM, Michal Privoznik wrote:
On 6/17/20 1:46 AM, Laine Stump wrote:
What do I think should be done? Good question. Possibly we could:
A) Call virNetDevTapReattachBridge() rather than
virNetDevTapAttachBridge() in virNetDevTapCreateInBridgePort(). This
would eliminate problem (2).
B) Instead of checking if the tap device MAC address matches, just
call virNetDevExists() - if it exists, then skip the RemovePort() -
this eliminates problems (3) and (4). (NB - this would fail if it
turns out that tap device deletion isn't completed synchronously with
qemu process termination!)
C) If we want to make it 100% sound, we need to make "check for
interface existence + removeport" an atomic operation, and mutually
exclusive with virNetDevTapCreate(). This would eliminate problem (1)
I still don't quite understand how there can be a race. I mean, from
system POV, libvirt creates a TAP, plugs it into a bridge (when
starting the first domain). And when shutting it down and starting the
second domain in parallel a new TAP device (with different index and
MAC) is created independent of the first TAP, no?
Yes.
There's no problem with the tap device creation/deletion by itself. The
problem is when that tap device is attached to an OVS switch. A "port"
of the same name as the tap device is created in OVS, and that port
continues to exist after the tap device itself is deleted. If a new tap
device of the same name is created without first deleting the port from
OVS, then the new tap device is automatically connected to the existing
port of the same name (I don't know the exact details of how this is
done, but OVS must have it's claws somewhere in the kernel or udev to
make this happen, as I've seen it (and can reproduce it) myself).
Even that, by itself, isn't a problem. But if you have a tap device
attached to an OVS port, and you delete the tap, that frees up the name
so that the next time someone asks to create a tap device they get that
same name. But if that happens before the old port has been removed from
OVS (which is a separate operation), then the new tap device is attached
to the old switch. In that case, simple requests to attach the new tap
to a different switch (or a Linux bridge) will fail.
We solve that problem when the new tap is being attached to an OVS
switch by prepending "--if-exists del-port blah" to the ovs-vsctl
command that attaches to the new switch, but that doesn't help in the
case that the new tap is being attached to a Linux host bridge.
Performing items (1) and (2) in my list will probably get rid of all bad
behavior in practice, but there is still technically the possibility
that the thread (B) for the new tap device will create that tap device
after the thread (A) for the old tap device checks that the device
doesn't exist (i.e. that a new tap with the same name hasn't yet been
created) but before removing the port (of the same name) from the OVS
switch.
This would all be much simpler if the kernel would put a "FIN WAIT"
state on all tap device names so that they couldn't be re-used for a few
seconds after deletion, but it doesn't, so we have to work with what
we've got.