On Fri, 5 Feb 2016 16:48:07 +0100
Dominique Martinet <asmadeus@xxxxxxxxxxxxx> wrote:

> Alex Williamson wrote on Fri, Feb 05, 2016:
> > I just debugged your case earlier in the week and the bug is with
> > the test case.
>
> Thank you for the extra information and sorry for double work.
> Getting technical informations through support is hard...
>
> > When vhost is used it takes a reference to the process mm (qemu).
> > That reference includes the mmap regions on the vfio device file.
> > vhost releases those references when the vhostfd file descriptor
> > is released.
> > So in the scenario you have here, killing qemu doesn't release the
> > vhostfd file descriptor because it's still opened in the script.
> > The vfio device is not released because there's still a reference
> > to the mmap. You've essentially put yourself into a deadlock
> > The solution is to close the vhostfd file descriptor in your test
> > script after launching qemu (echo 10<&-). Then qemu will hold the
> > last reference to vhostfd and killing qemu will release that file
> > descriptor and everything is released as intended.
>
> doh, I was sure I also had the hang when giving /dev/vhost-net to qemu
> and letting it open the fd so I wasn't looking at it at all, but
> you've got it.
> (I think I didn't have the hang as root because I didn't bother with
> vhostfd then either, the devil is in the details...)
>
> Closing the fd even after qemu has stopped will free the resource and
> let me unbind, so I will just make sure to order vhost-net-related to
> be closed before vfio stuff for now.
>
>
> Really not an obvious lock at first glance though, not sure how this
> could be 'fixed' now you've explained it so I'll just let you guys
> decide how to handle it. This has been very helpful.
>
>
> > I don't believe standard management tools like libvirt have this
> > problem.
>
> I'm pretty sure this would have been pointed at ages ago if libvirt
> had the problem :)
>
> Thank you for your time,

Yes, it's unfortunate that management tools need to be aware of these
sorts of semantics, but it's somewhat fundamental to the mechanism of
providing access through file descriptors.  When QEMU is killed,
there's no opportunity for cleanup, so all of the kernel interfaces
need to do this automatically, but the trigger for that is when the
release callbacks for the open files get called.  Therefore the owner
of that file, the shell in the example case, needs to not only give
QEMU access, but let it hold the only outstanding references.

Another unique feature of your test case is that the unbind is called
from the same shell that owns that same open vhost file descriptor.
The unbind of course blocks until the device is unused because there is
no opportunity for an -EBUSY return through that path in the kernel.
If the unbind is executed from a separate shell, then we don't have
that interaction: QEMU can be killed, which doesn't immediately release
the vhostfd, but the shell holding it is not blocked, so it can exit
and everything releases as expected again.

Anyway, I appreciate your concise test case.  I wasn't aware of that
interaction either, but it was an interesting problem to investigate.
Thanks,

Alex
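
For anyone who wants to see the file descriptor lifetime rule without
vhost or vfio involved, here is a minimal sketch in Python.  It is only
an analogy, not the vhost code path: a pipe's write end stands in for
the vhostfd, a sleeping child process stands in for QEMU, and EOF on
the read end stands in for the kernel's release callback.  The names
`released` and `demo` are made up for this illustration.

```python
import os
import select
import subprocess
import sys


def released(read_fd, timeout=0.2):
    """Return True once every reference to the write end is gone (EOF)."""
    ready, _, _ = select.select([read_fd], [], [], timeout)
    # Nothing is ever written, so "readable" can only mean EOF here.
    return bool(ready) and os.read(read_fd, 1) == b""


def demo():
    # write_fd plays the role of the vhostfd opened by the test script.
    read_fd, write_fd = os.pipe()

    # The child inherits write_fd, like qemu being handed the fd.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"],
        pass_fds=(write_fd,),
    )

    # Kill the child: its copy of the fd goes away, ours does not.
    child.kill()
    child.wait()
    after_kill = released(read_fd)

    # Drop the parent's reference: now the last reference is gone.
    os.close(write_fd)
    after_close = released(read_fd)

    os.close(read_fd)
    return after_kill, after_close


if __name__ == "__main__":
    print(demo())
```

Under these assumptions `demo()` returns `(False, True)`: killing the
child alone releases nothing while the parent still holds its copy, and
the release only happens once the parent closes it too.  In a bash
script, the shell's own copy of descriptor 10 would be closed with
`exec 10<&-`.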