On Mon, Nov 21, 2022 at 09:29:35PM +0800, Li Zetao wrote: > When doing the following test steps, an error was found: > step 1: modprobe virtio_net succeeded > # modprobe virtio_net <-- OK > > step 2: fault injection in register_netdevice() > # modprobe -r virtio_net <-- OK > # ... > FAULT_INJECTION: forcing a failure. > name failslab, interval 1, probability 0, space 0, times 0 > CPU: 0 PID: 3521 Comm: modprobe > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), > Call Trace: > <TASK> > ... > should_failslab+0xa/0x20 > ... > dev_set_name+0xc0/0x100 > netdev_register_kobject+0xc2/0x340 > register_netdevice+0xbb9/0x1320 > virtnet_probe+0x1d72/0x2658 [virtio_net] > ... > </TASK> > virtio_net: probe of virtio0 failed with error -22 > > step 3: modprobe virtio_net failed > # modprobe virtio_net <-- failed > virtio_net: probe of virtio0 failed with error -2 > > The root cause of the problem is that the queues are not > disable on the error handling path when register_netdevice() > fails in virtnet_probe(), resulting in an error "-ENOENT" > returned in the next modprobe call in setup_vq(). > > virtio_pci_modern_device uses virtqueues to send or > receive message, and "queue_enable" records whether the > queues are available. In vp_modern_find_vqs(), all queues > will be selected and activated, but once queues are enabled > there is no way to go back except reset. > > Fix it by reset virtio device on error handling path. > > Fixes: 1fcf0512c9c8 ("virtio_pci: modern driver") > Signed-off-by: Li Zetao <lizetao1@xxxxxxxxxx> I would add to this: ------ This makes error handling follow the same order as normal device cleanup which does: static void remove_vq_common(struct virtnet_info *vi) { virtio_reset_device(vi->vdev); /* Free unused buffers in both send and recv, if any. */ free_unused_bufs(vi); free_receive_bufs(vi); free_receive_page_frags(vi); virtnet_del_vqs(vi); } static void virtnet_remove(struct virtio_device *vdev) { struct virtnet_info *vi = vdev->priv; virtnet_cpu_notif_remove(vi); /* Make sure no work handler is accessing the device. */ flush_work(&vi->config_work); unregister_netdev(vi->dev); net_failover_destroy(vi->failover); remove_vq_common(vi); free_netdev(vi->dev); } So unregister, destroy failover, then reset - and that flow is better tested than error handling so we can be reasonably sure it works well. ----- I would thus probably also include this tag instead: Fixes: 0246555550 ("virtio_net: fix use after free on allocation failure") this is what introduced the difference in cleanup order, modern driver just added hardware support. Besides extending the commit log Acked-by: Michael S. Tsirkin <mst@xxxxxxxxxx> > --- > drivers/net/virtio_net.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > index 7106932c6f88..86e52454b5b5 100644 > --- a/drivers/net/virtio_net.c > +++ b/drivers/net/virtio_net.c > @@ -3949,12 +3949,11 @@ static int virtnet_probe(struct virtio_device *vdev) > return 0; > > free_unregister_netdev: > - virtio_reset_device(vdev); > - > unregister_netdev(dev); > free_failover: > net_failover_destroy(vi->failover); > free_vqs: > + virtio_reset_device(vdev); > cancel_delayed_work_sync(&vi->refill); > free_receive_page_frags(vi); > virtnet_del_vqs(vi); > -- > 2.25.1 _______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/virtualization