On 12/03/2015 17:22, Michael S. Tsirkin wrote: > On Wed, Mar 11, 2015 at 06:11:35PM +0800, Fam Zheng wrote: >> On Wed, 03/11 10:06, Michael S. Tsirkin wrote: >>> On Wed, Mar 11, 2015 at 04:09:17PM +0800, Fam Zheng wrote: >>>> Currently shutdown is nop for virtio devices, but the core code could >>>> remove things behind us such as MSI-X handler etc. For example in the >>>> case of virtio-scsi-pci, the device may still try to send interupts, >>>> which will be on IRQ lines seeing MSI-X disabled. Those interrupts will >>>> be unhandled, and may cause flood. >> >> Here is the problem I want to solve - file system driver hang: >> >> If a fs code happen to hit __wait_on_buffer right after pci pci_device_shutdown >> disabled msix, it will never make progress because the requests it waits for >> will never be completed. So the system hangs. > > Paolo says that pci reset of virtio scsi device guarantees > that all outstanding requests complete. For what it's worth, see here: static void virtio_scsi_reset(VirtIODevice *vdev) { ... s->resetting++; qbus_reset_all(&s->bus.qbus); s->resetting--; ... } static void scsi_disk_reset(DeviceState *dev) { SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev.qdev, dev); uint64_t nb_sectors; scsi_device_purge_requests(&s->qdev, SENSE_CODE(RESET)); ... } Paolo > If true and implemented correctly, I don't see what else > needs to be done. > > You will need to debug this some more. > > >> In other words we will want to reset virtio device before pci_device_shutdown >> AND wake up all waiters. >> >> Unfortunately, neither your patch nor mine does that, because virtio bus can be >> shutdown after pci bus (thanks to Jason for pointing out this). In that case, >> any completion after disabling msix is lost. >> >> Maybe we need both the pci shutdown handler to reset the device and the virtio >> shutdown handler to remove the device? >> >> Fam >> >>> >>> This sounds very tentative. Do you, in fact, observe some problems >>> with virtio scsi? How to reproduce them? this needs to go >>> into the commit messages. >> >> OK, my bad. >> >>> >>>> Remove the device in "shutdown" callback to allow device drivers clean >>>> up things. >>>> >>>> Signed-off-by: Fam Zheng <famz@xxxxxxxxxx> >>> >>> I'm concerned this will cause more hangs on shutdown: one >>> of the reasons for reboot is device mal-functioning. >>> How about we just reset devices instead? Something like >>> the below (untested). >>> >>> Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx> >>> >>> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c >>> index 5ce2aa4..0769941 100644 >>> --- a/drivers/virtio/virtio.c >>> +++ b/drivers/virtio/virtio.c >>> @@ -269,6 +269,17 @@ static int virtio_dev_remove(struct device *_d) >>> return 0; >>> } >>> >>> +static void virtio_dev_shutdown(struct device *_d) >>> +{ >>> + struct virtio_device *dev = dev_to_virtio(_d); >>> + /* >>> + * Reset the device to make it stop sending interrupts, DMA, etc. >>> + * We are shutting down, no need for full cleanup. >>> + */ >>> + dev->config->reset(dev); >>> + >>> +} >>> + >>> static struct bus_type virtio_bus = { >>> .name = "virtio", >>> .match = virtio_dev_match, >>> @@ -276,6 +288,7 @@ static struct bus_type virtio_bus = { >>> .uevent = virtio_uevent, >>> .probe = virtio_dev_probe, >>> .remove = virtio_dev_remove, >>> + .shutdown = virtio_dev_shutdown, >>> }; >>> >>> bool virtio_device_is_legacy_only(struct virtio_device_id id) _______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/virtualization