On 03/28/2014 01:30 PM, xiexiangyou wrote:
> Thanks for your reply.
>
> On 2014/3/27 22:14, Jiri Denemark wrote:
>
>> On Thu, Mar 27, 2014 at 20:51:24 +0800, x00221466 wrote:
>>> Hi,
>>>
>>> When live detaching a virtual net device, such as a virtio NIC,
>>> RTL8139, or E1000, there are some problems:
>>>
>>> (1) If the Guest OS doesn't support hot plugging of PCI devices and
>>> the virtual network device is detached by Libvirt, the "net device"
>>> in Qemu will still exist, but the "hostnet" (tap) in Qemu will be
>>> removed, so the net device in the Guest OS will no longer work.
>>>
>>> (2) If the NIC is ejected in the Guest OS, Qemu will remove the "net
>>> device" and then send DEVICE_DELETED to Libvirt. Libvirt receives the
>>> event in the event-loop thread and releases the info of the net
>>> device in qemuDomainRemoveNetDevice, but the "hostnet" in Qemu still
>>> exists. So the next live attach of a virtual net device will fail
>>> because of "Duplicate ID".
>>>
>>> #virsh attach-device win2008_st_r2_64 net.xml --live
>>> error: Failed to attach device from net.xml
>>> error: internal error: unable to execute QEMU command 'netdev_add':
>>> Duplicate ID 'hostnet0' for netdev
>>>
>>> (3) In addition, in qemuDomainDetachNetDevice, the function that
>>> detaches a net device, the "netdev_del" command is sent immediately
>>> after the "device_del" command. So the tap device is removed
>>> abruptly, before the net device has been completely removed.
>>>
>>> So I think it's more logical to send the Qemu command "netdev_del"
>>> after receiving the DEVICE_DELETED event. That avoids a conflict of
>>> device info between the Libvirt side and the Qemu side.
>> This sounds like it could be correct, although I'd prefer Laine to
>> express his opinion on this since he knows the corners in network device
>> assignment...
>>
>>> I create a thread in qemuDomainRemoveDevice, the handler of the
>>> DEVICE_DELETED event, to execute the QEMU command "netdev_del".
>> Hmm, it took me some time to realize why you'd need to do this.
>> It's because qemuDomainRemoveDevice is run from a DEVICE_DELETED event
>> handler and thus it cannot talk back to the monitor, right? In that
>
> Yep! Sending a Qemu monitor command in the event handler is not
> allowed, so I create a new thread to do this.
>
>> case, I suggest spawning a thread for qemuDomainRemoveDevice itself
>> within the event handler (qemuProcessHandleDeviceDeleted) so that all
>> qemuDomainRemove* methods can talk to monitor if they need to.
>
> I will modify it as you suggest.
>
>> To make the changes easier to follow, please do the change in two
>> patches. The first one to move qemuDomainRemoveDevice into a new thread
>> and the second one to move qemuMonitorRemoveNetdev and
>> qemuMonitorRemoveHostNetwork calls inside qemuDomainRemoveNetDevice.
>>
>> But first, wait for Laine's input, please.

Well, the level of my knowledge was that I noticed the problem caused by
the asynchronous nature of device_del (exactly the error message that
you're reporting) and reported this to QEMU, asking for an event to let
us know when it is okay to reuse a device ID (i.e. the DEVICE_DELETED
event). It appears that this isn't always good enough, though, so
*something* apparently needs to be done.

My understanding is that the problem is caused by netdev_del being
executed too soon after device_del, and then the device ID is forever
lost due to the unclean "cleanup". Is that correct? If so, then your
solution sounds correct.

But does netdev_del complete synchronously? If not, then we will need a
completion event for that as well.

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list
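
For illustration only, the ordering discussed in this thread can be
sketched as a toy Python simulation: device_del is just a request,
netdev_del is deferred until DEVICE_DELETED arrives, and the event
handler hands the cleanup to a separate thread instead of talking to the
monitor itself. FakeQemu, Manager, guest_ejects and demo are
hypothetical names invented for this sketch; they are not libvirt or
QEMU APIs, and the sketch assumes netdev_del completes synchronously,
which is exactly the open question above.

```python
import queue
import threading


class FakeQemu:
    """Toy stand-in for a QEMU monitor (hypothetical, not real QMP)."""

    def __init__(self):
        self.devices = {}          # guest-visible NIC id -> backend id
        self.netdevs = set()       # host-side backends, e.g. 'hostnet0'
        self.events = queue.Queue()

    def netdev_add(self, netdev_id):
        if netdev_id in self.netdevs:
            raise RuntimeError(f"Duplicate ID '{netdev_id}' for netdev")
        self.netdevs.add(netdev_id)

    def device_add(self, dev_id, netdev_id):
        self.devices[dev_id] = netdev_id

    def device_del(self, dev_id):
        # Asynchronous in spirit: nothing is removed until the guest
        # releases the device (see guest_ejects).
        pass

    def guest_ejects(self, dev_id):
        del self.devices[dev_id]
        self.events.put(("DEVICE_DELETED", dev_id))

    def netdev_del(self, netdev_id):
        # Assumed synchronous for the purposes of this sketch.
        self.netdevs.remove(netdev_id)


class Manager:
    """Toy management layer that defers backend cleanup to the event."""

    def __init__(self, qemu):
        self.qemu = qemu
        self.backend_of = {}       # NIC id -> backend id
        self.workers = []

    def attach(self, dev_id, netdev_id):
        self.qemu.netdev_add(netdev_id)
        self.qemu.device_add(dev_id, netdev_id)
        self.backend_of[dev_id] = netdev_id

    def detach(self, dev_id):
        # Only request removal; do NOT send netdev_del yet.
        self.qemu.device_del(dev_id)

    def handle_event(self, name, dev_id):
        # The handler must not issue monitor commands itself, so the
        # removal work runs in its own thread.
        if name == "DEVICE_DELETED":
            worker = threading.Thread(target=self._remove_device,
                                      args=(dev_id,))
            worker.start()
            self.workers.append(worker)

    def _remove_device(self, dev_id):
        netdev_id = self.backend_of.pop(dev_id)
        self.qemu.netdev_del(netdev_id)


def demo():
    qemu = FakeQemu()
    mgr = Manager(qemu)
    mgr.attach("net0", "hostnet0")
    mgr.detach("net0")
    # Backend is still present while the guest holds the NIC.
    backend_kept = "hostnet0" in qemu.netdevs
    qemu.guest_ejects("net0")
    mgr.handle_event(*qemu.events.get())
    for worker in mgr.workers:
        worker.join()
    # Re-attaching with the same IDs no longer hits "Duplicate ID".
    mgr.attach("net0", "hostnet0")
    return backend_kept, sorted(qemu.netdevs)


if __name__ == "__main__":
    print(demo())
```

If the guest never ejects the NIC, no event arrives and the backend
stays in place, so the guest's device keeps working (case 1 in the
original report).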