Thanks for your detailed analysis, I will remake a patch 发件人: Laine Stump (BTW, this other patch is also trying to solve the same problem: https://www.redhat.com/archives/libvir-list/2020-June/msg00525.html I made comments there earlier, and have learned a bit more since then: https://www.redhat.com/archives/libvir-list/2020-June/msg00634.html On 6/15/20 2:36 PM, Daniel Henrique Barboza wrote: > > > On 6/11/20 6:58 AM, Bingsong Si wrote: >> when shutdown vm, the qemuProcessStop cleanup virtual interface in >> two steps: > > s/when/When > >> 1. qemuProcessKill kill qemu process, and vif disappeared >> 2. ovs-vsctl del-port from the brige >> >> if start a vm in the middle of the two steps, the new vm will reused >> the vif, > > s/if/If > >> but removed from bridge by step 2 >> >> Signed-off-by: Bingsong Si <owen.si@xxxxxxxxx> >> --- >> src/qemu/qemu_process.c | 8 +++++--- >> 1 file changed, 5 insertions(+), 3 deletions(-) >> >> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c >> index d36088ba98..706248815a 100644 >> --- a/src/qemu/qemu_process.c >> +++ b/src/qemu/qemu_process.c >> @@ -7483,9 +7483,11 @@ void qemuProcessStop(virQEMUDriverPtr driver, >> if (vport->virtPortType == >> VIR_NETDEV_VPORT_PROFILE_MIDONET) { >> ignore_value(virNetDevMidonetUnbindPort(vport)); >> } else if (vport->virtPortType == >> VIR_NETDEV_VPORT_PROFILE_OPENVSWITCH) { >> - ignore_value(virNetDevOpenvswitchRemovePort( >> - virDomainNetGetActualBridgeName(net), >> - net->ifname)); >> + virMacAddr mac; >> + if (virNetDevGetMAC(net->ifname, &mac) < 0 || >> !virMacAddrCmp(&mac, &net->mac)) (Before anything else - virNetDevGetMAC() will actually *log* an error in libvirt's logs if the device isn't found (which will be in nearly 100% of all cases). That will lead to people reporting it as a bug, which gets very time consuming and expensive for anyone providing commercial support for a product that uses libvirt. If it is really necessary to check the MAC address of a device that legitimately may or may not exist, then there will need to be a "Quiet" version of the function that doesn't log any errors.)(Update after thinking about it - I don't think we should be checking the MAC address anyway, as it doesn't reliably differentiate "new" tap from "old" tap). > > Extra space between "||" and "!virMacAddrCmp(.." > > > With these nits fixed: > > > Reviewed-by: Daniel Henrique Barboza <danielhb413@xxxxxxxxx> This patch is attempting to solve a race condition between one guest starting at the same time another is stopping, and the mess that results if the new guest uses the same name for its tap device as the old guest used. For example, lets say libvirt thread A / guest A is doing this: A1) the QEMU process is terminated (and tap device, e.g. "vnet0", is implicitly deleted) either by libvirtd or by some external force (including possibly the QEMU process itself A2) the tap's associated port (also "vnet0") is removed from the OVS switch by libvirt. While libvirt thread B is doing this: B1) a new tap device is created for a new QEMU process. If A1 has already happened, then the kernel will likely give the new tap the same name - "vnet0". B2) the new tap device is attached to an OVS switch (or possibly a Linux host bridge). The problem occurs when if B2 happens before A2, which could result in B2 attaching the new tap to the OVS switch, and then A2 disconnecting it from the switch. So libvirt thinks the new QEMU guest tap is attached to the switch, but it isn't. This patch attempts to eliminate the race by checking, prior to removing "old tap"s port on the switch, that 1) the tap device doesn't exist, or that if it does 2) that the MAC address of the tap device is unchanged. Assuming that the two guests would not use the same MAC address for their tap devices (which is probably the case, but isn't *required* to be true), this does significantly narrow the potential time for a race condition, and in particular makes sure that we never remove a port that hasn't just been "re-added" by the new QEMU. However, this just creates a smaller window for the race, and different problem for the remainder of the time. 1) smaller window - it would still be possible for the following to happen: a) old qemu terminates, tap device "vnet0" is deleted b) libvirt checks MAC address and learns that there is no device "vnet0", so it calls virNetDevOpenvswitchRemovePort(), but before ovs-vsctl can be called... c) libvirt creates a new tap device for new QEMU, kernel names it "vnet0" d) libvirt calls virNetDevOpenvswitchAddPort() and the new tap device "vnet0" to the switch e) the ovs-vsctl from (b) is finally able to run, and removes "vnet0" from the switch. Granted, this is *highly* unlikely, since there is nothing extra between checking MAC address and removing the port, but there is nothing enforcing it. 2) New Problem - I think the testing here was done with two guests who both attached their tap to the same (or another) OVS switch. It's relying on the ability to attach the tap device to a new switch/bridge even if it is already attached to some other switch bridge. Normally ovs-vsctl would refuse to add a port to a switch if a port by that same name was already on any OVS switch. I just looked it up though, and in this case libvirt is able to make this work by including "--if-exists del-port $ifname" in the ovs-vsctl command that *adds* the new port. However, if you have the same situation but the new switch device is instead a Linux bridge, the attempt to attach to the bridge fails. So, if thread "B" has already created the new tap device by the time thread "A" is deciding whether or not to remove the old port, the port won't be removed, and if the new guest "B" is using a Linux host bridge, libvirt will fail to attach the new tap to the bridge. So, a summary of the problems with this patch: 1) The race window is reduced (and may be gone in practical terms), but not guaranteed to be eliminated. 2) For the time during the "previous race window start" and "new race window start" a new problem has been created - if the new guest uses a Linux host bridge, the connection will fail 3) virNetDevGetMAC() will put an error in the system logs if the device doesn't exist (and it almost always will *not* exist, so this will be significant 4) Although it is almost always the case that two guests will not use the same MAC address for their network interfaces, there is nothing preventing it - we shouldn't assume that MAC addresses are unique. I think that check is actually superfluous, since the qemu process has always been terminated by the time we get to that place in the code, so the tap device should have been auto-deleted. What do I think should be done? Good question. Possibly we could: A) Call virNetDevTapReattachBridge() rather than virNetDevTapAttachBridge() in virNetDevTapCreateInBridgePort(). This would eliminate problem (2). B) Instead of checking if the tap device MAC address matches, just call virNetDevExists() - if it exists, then skip the RemovePort() - this eliminates problems (3) and (4). (NB - this would fail if it turns out that tap device deletion isn't completed synchronously with qemu process termination!) C) If we want to make it 100% sound, we need to make "check for interface existence + removeport" an atomic operation, and mutually exclusive with virNetDevTapCreate(). This would eliminate problem (1) > >> + ignore_value(virNetDevOpenvswitchRemovePort( >> + virDomainNetGetActualBridgeName(net), >> + net->ifname)); >> } >> } >> > |