Re: [PATCH v2 4/5] qemu_hotplug: Fix a rare race condition when detaching a device twice


On 3/14/19 3:14 PM, Peter Krempa wrote:
On Thu, Mar 14, 2019 at 14:56:48 +0100, Michal Privoznik wrote:
On 3/14/19 2:18 PM, Peter Krempa wrote:
On Thu, Mar 14, 2019 at 13:22:38 +0100, Michal Privoznik wrote:

[...]


How can this be considered success? Also this introduces a possible
regression. The DEVICE_DELETED event should be fired only after the
device was entirely unplugged. Claiming success before seeing the event
can lead to another race when qemu deleted the device from the internal
list so that 'device_del' does not see it any more but did not finish
cleanup fully.

We need to start the '*Remove' handler only after the DEVICE_DELETED
event was received.

I beg to differ. If we were to report error here users would see the API
failing with error "Device not found". So they'd run 'virsh dumpxml' only to
find the device there. I don't find such behaviour sane. If one API tells me
a device is not there then another one shall not tell otherwise.

Well. The user semantics can be confusing here. What we can't allow
though is that some of the steps done in the qemuDomainRemove*Device
will fail because qemu will still have some internal reference to some
backend object.

I'm not quite sure I follow. qemuDomainRemove*Device will be run exactly once, never more. Running it more than once would be a problem, but I fail to see how my patch allows that. Can you shed more light on that please?

What I'd find more of a problem is that I'd try to
attach a similar device only to be told that it already exists.

I don't know what you mean here either. With my patches, not only do we enter the wait for the event again (thus widening the window in which the event may arrive), but we are also compliant with the detach semantics. Let's think of an extreme case: qemu fails to deliver the DEVICE_DELETED event. With my patches you'll get:

1) virsh detach-device-alias $dom $alias
Device detach request sent successfully

2) virsh detach-device-alias $dom $alias
Device detach request sent successfully

3) virsh detach-device-alias $dom $alias
Device detach request sent successfully
 ...

If we were to fail, as you suggest:
1) virsh detach-device-alias $dom $alias
Device detach request sent successfully

2) virsh detach-device-alias $dom $alias
monitor error: DeviceNotFound

3) virsh detach-device-alias $dom $alias
monitor error: DeviceNotFound


Now if you run 'virsh dumpxml $dom' as a 4th step (for both scenarios) the device is still there. So how can it be in the domain XML and not found at the same time? And if you try to attach it, everything will work: libvirt generates a different address to plug the device into, since it still sees the old one.
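
To make the two scenarios concrete, here is a toy Python model. It is not libvirt or qemu code: ToyDomain, run() and the alias "ua-disk0" are made up for illustration. It assumes the extreme case above, i.e. qemu drops the device from its internal list on the first device_del but never delivers DEVICE_DELETED, so the device never leaves the domain XML.

```python
SENT = "Device detach request sent successfully"
NOT_FOUND = "monitor error: DeviceNotFound"


class ToyDomain:
    """Hypothetical model of one domain with a single hotplugged device."""

    def __init__(self, alias):
        self.xml_devices = {alias}   # what 'virsh dumpxml' would still show
        self.qemu_devices = {alias}  # what 'device_del' can still find

    def detach(self, alias, tolerate_not_found):
        if alias in self.qemu_devices:
            # First request: qemu forgets the device, but in this model the
            # DEVICE_DELETED event never arrives, so xml_devices is untouched.
            self.qemu_devices.remove(alias)
            return SENT
        # Repeated request: qemu no longer finds the device.
        if tolerate_not_found and alias in self.xml_devices:
            # Assumed policy of the patch: keep waiting for the event and
            # report the request as sent.
            return SENT
        # Alternative policy: pass the monitor error on to the caller.
        return NOT_FOUND


def run(tolerate_not_found, attempts=3):
    dom = ToyDomain("ua-disk0")
    results = [dom.detach("ua-disk0", tolerate_not_found)
               for _ in range(attempts)]
    # Second value: is the device still in the domain XML afterwards?
    return results, "ua-disk0" in dom.xml_devices


if __name__ == "__main__":
    for policy in (True, False):
        print(policy, run(policy))
```

In both runs the second value stays True, which is the point: under the failing policy the caller is told the device is not found while the domain XML still lists it.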

Michal

