On 17.01.22 09:40, Michael S. Tsirkin wrote: > On Mon, Jan 17, 2022 at 09:31:56AM +0100, David Hildenbrand wrote: >> On 17.01.22 08:55, Michael S. Tsirkin wrote: >>> On Mon, Jan 17, 2022 at 02:40:11PM +0800, Jason Wang wrote: >>>> >>>> 在 2022/1/15 上午5:43, Michael S. Tsirkin 写道: >>>>> A common pattern for device reset is currently: >>>>> vdev->config->reset(vdev); >>>>> .. cleanup .. >>>>> >>>>> reset prevents new interrupts from arriving and waits for interrupt >>>>> handlers to finish. >>>>> >>>>> However if - as is common - the handler queues a work request which is >>>>> flushed during the cleanup stage, we have code adding buffers / trying >>>>> to get buffers while device is reset. Not good. >>>>> >>>>> This was reproduced by running >>>>> modprobe virtio_console >>>>> modprobe -r virtio_console >>>>> in a loop, and this reasoning seems to apply to virtio mem though >>>>> I could not reproduce it there. >>>>> >>>>> Fix this up by calling virtio_break_device + flush before reset. >>>>> >>>>> Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx> >>>>> --- >>>>> drivers/virtio/virtio_mem.c | 2 ++ >>>>> 1 file changed, 2 insertions(+) >>>>> >>>>> diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c >>>>> index 38becd8d578c..33b8a118a3ae 100644 >>>>> --- a/drivers/virtio/virtio_mem.c >>>>> +++ b/drivers/virtio/virtio_mem.c >>>>> @@ -2888,6 +2888,8 @@ static void virtio_mem_remove(struct virtio_device *vdev) >>>>> virtio_mem_deinit_hotplug(vm); >>>>> /* reset the device and cleanup the queues */ >>>>> + virtio_break_device(vdev); >>>>> + flush_work(&vm->wq); >>>> >>>> >>>> We set vm->removing to true and call cancel_work_sync() in >>>> virtio_mem_deinit_hotplug(). Isn't is sufficient? >>>> >>>> Thanks >>> >>> >>> Hmm I think you are right. David, I will drop this for now. >>> Up to you to consider whether some central capability will be >>> helpful as a replacement for the virtio-mem specific "removing" flag. >> >> It's all a bit tricky because we also have to handle pending timers and >> pending memory onlining/offlining operations in a controlled way. Maybe >> we could convert to virtio_break_device() and use the >> &dev->vqs_list_lock as a replacement for the removal_lock. However, I'm >> not 100% sure if it's nice to use that lock from >> drivers/virtio/virtio_mem.c directly. > > We could add an API if you like. Or maybe it makes sense to add a > separate one that lets you find out that removal started. Need to figure > out how to handle suspend too then ... > Generally we have these checks that device is not going away > sprinkled over all drivers and I don't like it, but > it's not easy to build a sane API to handle it, especially > for high speed things when we can't take locks because performance. The interesting case might be how to handle virtio_mem_retry(), whereby we conditionally queue work if !removing. Having that said, in an ideal world we could deny removal requests for virtio_mem completely if there is still any memory added to the system ... -- Thanks, David / dhildenb _______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/virtualization