environment: libvirt-4.3.0, qemu-kvm-ev-2.10.0, kernel-3.10.0-1062, CentOS 7, openvswitch-2.3.1

vm network xml:
<interface type='bridge'>
  <mac address='52:54:00:46:45:95'/>
  <source bridge='ovsbr-mgt'/>
  <vlan>
    <tag id='0'/>
  </vlan>
  <virtualport type='openvswitch'>
    <parameters interfaceid='596c6ab7-4557-4935-af97-62a35d933f8d'/>
  </virtualport>
  <target dev='vnet0'/>
  <model type='virtio'/>
  <link state='up'/>
  <alias name='net0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</interface>
When qemuProcessStart() in qemu_process.c fails, the cleanup in qemuProcessStop() first stops the qemu process (at which point the kernel reclaims the tap device, and its name can immediately be handed to another virtual machine), and only afterwards removes the OVS port. Since qemuProcessStart and qemuProcessStop can run concurrently, the OVS port removal in qemuProcessStop may delete a port that now belongs to a different virtual machine using an openvswitch virtualport.
For example: when vm1 fails to start, its tap device vnet0 is reclaimed first. In that window vm2 starts, is assigned the same tap device name vnet0, and OVS adds port vnet0 for vm2. When vm1's cleanup then removes port vnet0, it removes the port that now belongs to vm2, and vm2 can no longer access the network through vnet0.
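To make the race window concrete, here is a condensed sketch of the ordering inside qemuProcessStop() (the function and field names are real libvirt identifiers, but the body is heavily abridged for illustration and is not the verbatim source):

/* Abridged sketch of qemuProcessStop() in qemu_process.c -- not the
 * verbatim libvirt source. */
void
qemuProcessStop(virQEMUDriverPtr driver, virDomainObjPtr vm,
                virDomainShutoffReason reason)
{
    size_t i;

    /* Step 1: kill the qemu process. The kernel destroys its tap
     * devices here, so a name like "vnet0" immediately becomes free
     * for reuse by a VM that is starting concurrently. */
    qemuProcessKill(vm, VIR_QEMU_PROCESS_KILL_FORCE);

    /* ... lots of other teardown ... */

    /* Step 2: only now is the OVS port removed, looked up by *name*.
     * If another VM was given the same tap name in the meantime, this
     * deletes that VM's port rather than ours. */
    for (i = 0; i < vm->def->nnets; i++) {
        virDomainNetDefPtr net = vm->def->nets[i];
        virNetDevVPortProfilePtr vport =
            virDomainNetGetActualVirtPortProfile(net);

        if (vport &&
            vport->virtPortType == VIR_NETDEV_VPORT_PROFILE_OPENVSWITCH)
            ignore_value(virNetDevOpenvswitchRemovePort(
                             virDomainNetGetActualBridgeName(net),
                             net->ifname));
    }
}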
To reproduce: batch start or migrate 10 virtual machines to the same node, with one of them failing to start. The failure can be a storage connection error or anything else (when we reproduced this internally, one virtual machine was deliberately attached to an invalid storage volume so that it would fail).
Impact: after a batch migration, one virtual machine loses network access and its service is interrupted.
Okay, I understand the problem now, but your patch doesn't fix it.
The problem is (as also described in
https://www.redhat.com/archives/libvir-list/2020-June/msg00481.html
) a race condition created when the qemu process is shut down just
as a new qemu process is started - since the old tap device is
deleted (and its name made available for re-use) implicitly as a
part of the old qemu process being terminated, and since the old
qemu process has terminated before we remove the port from OVS, a
new tap (with the old name, as the kernel thinks it is now
available) may have already been created by the kernel by the time
qemuProcessStop() gets around to removing the port associated (by
name) with the old tap from the OVS switch.
And we can't eliminate the race by simply moving the call to virNetDevOpenvswitchRemovePort() up before the call to qemuProcessKill() - it is also possible that qemu could have exited by itself, or that some outside force other than libvirt killed it - in this case the tap has already been deleted by the time qemuProcessStop() is reached.
As for your method of eliminating the race, there are two problems:
1) if virNetDevOpenvswitchRemovePort() isn't called, then OVS will automatically grab the new tap device as soon as it is created and re-attach it to the old switch. As long as the new qemu process asks to attach it to that same switch, then there is no problem. But if the new process tries to attach the device to a *different* switch (for example, a Linux host bridge) then the attach will fail.
2) your method of deciding whether or not virNetDevOpenvswitchRemovePort() should be called by libvirt is invalid - the reason isn't always set to VIR_DOMAIN_SHUTOFF_FAILED when the qemu process has been terminated external to libvirt. But beyond that, the code shows that the qemu process is *always* terminated prior to the call to virNetDevOpenvswitchRemovePort(). So at most, your patch might be making the race window smaller in some cases, but it isn't eliminating it.
Fixing this race condition requires something more than just
adding an extra clause to a conditional. It may be possible to
tell OVS to automatically delete the port as the tap is deleted
(which would be nice, but I'm actually not expecting to find a way
to do that), or it may require libvirt to name and track tap
devices itself (as it already does for macvtap devices), which
*also* has problems - in particular, whether or not we need to
account for the possibility of multiple simultaneous libvirtd
processes.
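For illustration only, here is one hypothetical mitigation along those lines, as a standalone sketch (this is not libvirt's implementation, and both helper functions are invented): before deleting the OVS port by tap name, check that the port's external-ids:iface-id still matches the interfaceid of the domain being cleaned up, and skip the deletion if the name has been reused.

/* Hypothetical, standalone sketch -- NOT libvirt code. Remove an OVS
 * port by name only if its external-ids:iface-id still identifies the
 * domain we are cleaning up after. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read external-ids:iface-id of 'ifname' into 'buf'. Returns 0 on
 * success, -1 if the port is gone or the query fails. */
static int
ovs_get_iface_id(const char *ifname, char *buf, size_t buflen)
{
    char cmd[256];
    FILE *fp;
    int got;

    snprintf(cmd, sizeof(cmd),
             "ovs-vsctl --timeout=5 --if-exists get Interface %s "
             "external_ids:iface-id", ifname);
    if (!(fp = popen(cmd, "r")))
        return -1;
    got = (fgets(buf, buflen, fp) != NULL);
    pclose(fp);
    return got ? 0 : -1;
}

/* Delete the port only if it still belongs to 'our_iface_id'. */
static void
ovs_remove_port_if_ours(const char *ifname, const char *our_iface_id)
{
    char current[128];
    char cmd[256];

    if (ovs_get_iface_id(ifname, current, sizeof(current)) < 0)
        return;                 /* port already gone */
    if (!strstr(current, our_iface_id))
        return;                 /* tap name reused by another domain */

    snprintf(cmd, sizeof(cmd),
             "ovs-vsctl --timeout=5 -- --if-exists del-port %s", ifname);
    system(cmd);
}

int
main(void)
{
    /* values taken from the interface XML at the top of this thread */
    ovs_remove_port_if_ours("vnet0",
                            "596c6ab7-4557-4935-af97-62a35d933f8d");
    return 0;
}

Note that this check-then-delete sequence is itself a (much smaller) race; a real fix would need the match and the deletion to happen atomically in OVS, or OVS itself to drop the port when the underlying tap device disappears.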
Logs of libvirt's OVS port handling (vnet4 is added for one VM, re-added for a second VM with a different mac/iface-id, and then the bare del-port removes the second VM's port):
Jun 10 19:11:32 zbs-sh-elf-11 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --if-exists del-port vnet4 -- add-port ovsbr-mgt vnet4 tag=0 -- set Interface vnet4 "external-ids:attached-mac=\"52:54:00:92:7e:7f\"" -- set Interface vnet4 "external-ids:iface-id=\"afb3a67a-5e5d-4ca6-b625-ebce6a9c8d03\"" -- set Interface vnet4 "external-ids:vm-id=\"7b9e4d5a-e8e9-4527-9b89-dd1f74d02526\"" -- set Interface vnet4 external-ids:iface-status=active
Jun 10 19:11:32 zbs-sh-elf-11 kernel: device vnet4 entered promiscuous mode
Jun 10 19:11:32 zbs-sh-elf-11 kernel: device vnet4 left promiscuous mode
Jun 10 19:11:32 zbs-sh-elf-11 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --if-exists del-port vnet4 -- add-port ovsbr-mgt vnet4 tag=0 -- set Interface vnet4 "external-ids:attached-mac=\"52:54:00:b7:f4:07\"" -- set Interface vnet4 "external-ids:iface-id=\"c837d02d-4a4e-4f9c-9bee-7e5efce01a8e\"" -- set Interface vnet4 "external-ids:vm-id=\"83035f1e-faed-43d6-951e-08c90c9006a9\"" -- set Interface vnet4 external-ids:iface-status=active
Jun 10 19:11:32 zbs-sh-elf-11 kernel: device vnet4 entered promiscuous mode
Jun 10 19:11:32 zbs-sh-elf-11 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=5 -- --if-exists del-port vnet4
Thanks
Laine Stump <laine@xxxxxxxxxx> wrote on Tue, Jun 16, 2020 at 10:01 AM:
On 6/15/20 2:04 PM, Daniel Henrique Barboza wrote:
>
>
> On 6/12/20 3:18 AM, gongwei@xxxxxxxxxx wrote:
>> From: gongwei <gongwei@xxxxxxxxxx>
>>
>> start to failed will not remove the openvswitch port,
>> the port recycling in this case lets openvswitch handle it by itself
>>
>> Signed-off-by: gongwei <gongwei@xxxxxxxxxx>
>> ---
>
> Can you please elaborate on the commit message? By the commit title and
> the code, I'm assuming that you're saying that we shouldn't remove the
> openvswitch port if the QEMU process failed to start, for any other
> reason aside from SHUTOFF_FAILED.
More importantly, what "port recycling" would take effect depending on
how the qemu process is stopped (which I would think wouldn't make any
difference to OVS), and why is it necessary for libvirt to not do it?
Up until now, what I have known is that ports will not be removed from
an OVS switch unless they are explicitly removed with ovs-vsctl, and
this attachment will persist across reboots of the host system. As a
matter of fact I've had cases during development where libvirt didn't
remove the OVS port for a tap device when a guest was terminated, and
then many *days* (and several reboots) later the same tap device name
was used for a different guest that was using a Linux host bridge, and
the tap device failed to attach to the Linux host bridge because it had
already been auto-attached back to the OVS switch as soon as it was created.
Can you describe how to reproduce the situation where libvirt removes
the OVS port when it shouldn't, and what is the bad outcome of that
happening?
>
> The code itself looks ok.
>
>
>
>> src/qemu/qemu_process.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
>> index d36088ba98..439bd5b396 100644
>> --- a/src/qemu/qemu_process.c
>> +++ b/src/qemu/qemu_process.c
>> @@ -7482,7 +7482,8 @@ void qemuProcessStop(virQEMUDriverPtr driver,
>>          if (vport) {
>>              if (vport->virtPortType == VIR_NETDEV_VPORT_PROFILE_MIDONET) {
>>                  ignore_value(virNetDevMidonetUnbindPort(vport));
>> -            } else if (vport->virtPortType == VIR_NETDEV_VPORT_PROFILE_OPENVSWITCH) {
>> +            } else if (vport->virtPortType == VIR_NETDEV_VPORT_PROFILE_OPENVSWITCH &&
>> +                       reason != VIR_DOMAIN_SHUTOFF_FAILED) {
>>                  ignore_value(virNetDevOpenvswitchRemovePort(
>>                                   virDomainNetGetActualBridgeName(net),
>>                                   net->ifname));
>>
>
--
Gong Wei
Mobile: 18883262137