On 5/25/21 10:24 AM, Hannes Reinecke wrote: > On 5/25/21 6:47 PM, Stefan Hajnoczi wrote: >> On Mon, May 24, 2021 at 11:33:33PM -0700, Dongli Zhang wrote: >>> On 5/24/21 6:24 AM, Stefan Hajnoczi wrote: >>>> On Sun, May 23, 2021 at 09:39:51AM +0200, Hannes Reinecke wrote: >>>>> On 5/23/21 8:38 AM, Dongli Zhang wrote: >>>>>> This RFC is to trigger the discussion about to poll and kick the >>>>>> virtqueue on purpose in virtio-scsi timeout handler. >>>>>> >>>>>> The virtio-scsi relies on the virtio vring shared between VM and host. >>>>>> The VM side produces requests to vring and kicks the virtqueue, while the >>>>>> host side produces responses to vring and interrupts the VM side. >>>>>> >>>>>> By default the virtio-scsi handler depends on the host timeout handler >>>>>> by BLK_EH_RESET_TIMER to give host a chance to perform EH. >>>>>> >>>>>> However, this is not helpful for the case that the responses are available >>>>>> on vring but the notification from host to VM is lost. >>>>>> >>>>> How can this happen? >>>>> If responses are lost the communication between VM and host is broken, and >>>>> we should rather reset the virtio rings themselves. >>>> >>>> I agree. In principle it's fine to poll the virtqueue at any time, but I >>>> don't understand the failure scenario here. It's not clear to me why the >>>> device-to-driver vq notification could be lost. >>>> >>> >>> One example is the CPU hotplug issue before the commit bf0beec0607d ("blk-mq: >>> drain I/O when all CPUs in a hctx are offline") was available. The issue is >>> equivalent to loss of interrupt. Without the CPU hotplug fix, while NVMe driver >>> relies on the timeout handler to complete inflight IO requests, the PV >>> virtio-scsi may hang permanently. >>> >>> In addition, as the virtio/vhost/QEMU are complex software, we are not able to >>> guarantee there is no further lost of interrupt/kick issue in the future. It is >>> really painful if we encounter such issue in production environment. >> >> Any number of hardware or software bugs might exist that we don't know >> about, yet we don't pre-emptively add workarounds for them because where >> do you draw the line? >> >> I checked other SCSI/block drivers and found it's rare to poll in the >> timeout function so there does not seem to be a consensus that it's >> useful to do this. >> > Not only this; it's downright dangerous attempting to do that in SCSI. > In SCSI we don't have fixed lifetime guarantees that NVMe has, so there will be > a race condition between timeout and command completion. Thank you very much for the explanation. Yes, we cannot do that due to the race. Dongli Zhang > Plus there is no interface in SCSI allowing to 'poll' for completions in a > meaningful manner. > >> That said, it's technically fine to do it, the virtqueue APIs are there >> and can be used like this. So if you and others think this is necessary, >> then it's a pretty small change and I'm not against merging a patch like >> this. >> > I would rather _not_ put more functionality into the virtio_scsi timeout > handler; this only serves to assume that the timeout handler has some > functionality in virtio. > Which it patently hasn't, as the prime reason for a timeout handler is to > _abort_ a command, which we can't on virtio. > Well, we can on virtio, but qemu as the main user will re-route the I/O from > virtio into doing async-I/O, and there is no way how we can abort outstanding > asynchronous I/O. > Or any other ioctl, for that matter. > > Cheers, > > Hannes