On Thu, 03/12 17:22, Michael S. Tsirkin wrote:
> On Wed, Mar 11, 2015 at 06:11:35PM +0800, Fam Zheng wrote:
> > On Wed, 03/11 10:06, Michael S. Tsirkin wrote:
> > > On Wed, Mar 11, 2015 at 04:09:17PM +0800, Fam Zheng wrote:
> > > > Currently shutdown is nop for virtio devices, but the core code could
> > > > remove things behind us such as MSI-X handler etc. For example in the
> > > > case of virtio-scsi-pci, the device may still try to send interupts,
> > > > which will be on IRQ lines seeing MSI-X disabled. Those interrupts will
> > > > be unhandled, and may cause flood.
> >
> > Here is the problem I want to solve - file system driver hang:
> >
> > If a fs code happen to hit __wait_on_buffer right after pci pci_device_shutdown
> > disabled msix, it will never make progress because the requests it waits for
> > will never be completed. So the system hangs.
>
> Paolo says that pci reset of virtio scsi device guarantees
> that all outstanding requests complete.
>
> If true and implemented correctly, I don't see what else
> needs to be done.
>
> You will need to debug this some more.

First of all, I was wrong about the fs driver above. Scratch that, and sorry
for the misleading description.

Regarding the hang in shutdown, Ulrich Obergfell has already pointed out that
the vcpu is "busy/stuck in interrupt processing":

https://bugzilla.redhat.com/attachment.cgi?id=998391 (RHBZ 1199155)

Summary:

The reason it is stuck is that an IRQ from virtio-scsi-pci is not handled.

Why is there that IRQ? Because pci core code disabled msix.

Why is it not handled? Because msix was disabled behind the back of
virtio-scsi, which is still waiting for msix interrupts.

"Hence, the interrupt will not be acknowledged and the guest becomes flooded
with IRQ 11 interrupt."
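To make the sequence concrete, here is a toy user-space model of it. This is
only a sketch and not kernel code: struct model_dev, core_shutdown() and
irq_would_be_unhandled() are names made up for illustration, and the model
assumes that a non-nop driver shutdown resets the device so that it stops
raising interrupts.

```c
#include <stdbool.h>

/* Hypothetical model of the device's interrupt state; not a kernel struct. */
struct model_dev {
    bool msix_enabled;  /* MSI-X still enabled on the device */
    bool quiesced;      /* driver reset the device, no more interrupts */
    bool intx_handler;  /* a handler is registered for the legacy IRQ line */
};

/*
 * Models the shutdown order described above: the driver's shutdown
 * callback (if it does anything) runs first, then the core disables
 * MSI-X regardless of what the driver did.
 */
static void core_shutdown(struct model_dev *d, bool driver_resets_device)
{
    if (driver_resets_device)
        d->quiesced = true;   /* assumption: reset stops the device */
    d->msix_enabled = false;  /* done by the core, behind the driver */
}

/* True if an interrupt raised by the device now would go unhandled. */
static bool irq_would_be_unhandled(const struct model_dev *d)
{
    if (d->quiesced)
        return false;         /* device no longer raises interrupts */
    if (d->msix_enabled)
        return false;         /* delivered to the driver's MSI-X handler */
    return !d->intx_handler;  /* falls back to the legacy IRQ line */
}
```

With a nop shutdown (driver_resets_device == false) and no handler on the
legacy line, irq_would_be_unhandled() returns true after core_shutdown(),
which is the flood case. If the driver resets the device first, it returns
false.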

Fortunately it's not a livelock for upstream, because of:

    commit 184564efae4d775225c8fe3b762a56956fb1f827
    Author: Zhang Haoyu <zhanghy@xxxxxxxxxxx>
    Date:   Thu Sep 11 16:47:04 2014 +0800

        kvm: ioapic: conditionally delay irq delivery duringeoi broadcast

But we should still do the shutdown right. I have also proposed that the pci
core shutdown path not shut down msix if the device's driver doesn't have a
shutdown function:

http://www.spinics.net/lists/kernel/msg1944041.html

With that patch applied, the "nop" .shutdown in virtio-pci shouldn't hurt
much.

Regarding handling the requests, I'm no longer sure we really care about them
at shutdown. As you said, waiting for requests may cause more hangs.

Ideas?

Fam