On Mon, Jan 17, 2022 at 11:25:12AM +0100, David Hildenbrand wrote: > On 17.01.22 09:40, Michael S. Tsirkin wrote: > > On Mon, Jan 17, 2022 at 09:31:56AM +0100, David Hildenbrand wrote: > >> On 17.01.22 08:55, Michael S. Tsirkin wrote: > >>> On Mon, Jan 17, 2022 at 02:40:11PM +0800, Jason Wang wrote: > >>>> > >>>> 在 2022/1/15 上午5:43, Michael S. Tsirkin 写道: > >>>>> A common pattern for device reset is currently: > >>>>> vdev->config->reset(vdev); > >>>>> .. cleanup .. > >>>>> > >>>>> reset prevents new interrupts from arriving and waits for interrupt > >>>>> handlers to finish. > >>>>> > >>>>> However if - as is common - the handler queues a work request which is > >>>>> flushed during the cleanup stage, we have code adding buffers / trying > >>>>> to get buffers while device is reset. Not good. > >>>>> > >>>>> This was reproduced by running > >>>>> modprobe virtio_console > >>>>> modprobe -r virtio_console > >>>>> in a loop, and this reasoning seems to apply to virtio mem though > >>>>> I could not reproduce it there. > >>>>> > >>>>> Fix this up by calling virtio_break_device + flush before reset. > >>>>> > >>>>> Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx> > >>>>> --- > >>>>> drivers/virtio/virtio_mem.c | 2 ++ > >>>>> 1 file changed, 2 insertions(+) > >>>>> > >>>>> diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c > >>>>> index 38becd8d578c..33b8a118a3ae 100644 > >>>>> --- a/drivers/virtio/virtio_mem.c > >>>>> +++ b/drivers/virtio/virtio_mem.c > >>>>> @@ -2888,6 +2888,8 @@ static void virtio_mem_remove(struct virtio_device *vdev) > >>>>> virtio_mem_deinit_hotplug(vm); > >>>>> /* reset the device and cleanup the queues */ > >>>>> + virtio_break_device(vdev); > >>>>> + flush_work(&vm->wq); > >>>> > >>>> > >>>> We set vm->removing to true and call cancel_work_sync() in > >>>> virtio_mem_deinit_hotplug(). Isn't is sufficient? > >>>> > >>>> Thanks > >>> > >>> > >>> Hmm I think you are right. David, I will drop this for now. > >>> Up to you to consider whether some central capability will be > >>> helpful as a replacement for the virtio-mem specific "removing" flag. > >> > >> It's all a bit tricky because we also have to handle pending timers and > >> pending memory onlining/offlining operations in a controlled way. Maybe > >> we could convert to virtio_break_device() and use the > >> &dev->vqs_list_lock as a replacement for the removal_lock. However, I'm > >> not 100% sure if it's nice to use that lock from > >> drivers/virtio/virtio_mem.c directly. > > > > We could add an API if you like. Or maybe it makes sense to add a > > separate one that lets you find out that removal started. Need to figure > > out how to handle suspend too then ... > > Generally we have these checks that device is not going away > > sprinkled over all drivers and I don't like it, but > > it's not easy to build a sane API to handle it, especially > > for high speed things when we can't take locks because performance. > > The interesting case might be how to handle virtio_mem_retry(), whereby > we conditionally queue work if !removing. > > Having that said, in an ideal world we could deny removal requests for > virtio_mem completely if there is still any memory added to the system ... > > > -- > Thanks, > > David / dhildenb removal requests might come from guest admin. -- MST _______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/virtualization