On Fri, Jun 17, 2022 at 07:46:23PM +0800, Jason Wang wrote:
> On Fri, Jun 17, 2022 at 6:13 PM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
> >
> > On Fri, Jun 17, 2022 at 03:29:49PM +0800, Jason Wang wrote:
> > > We used to call virtio_device_ready() after netdev registration. This
> > > cause a race between ndo_open() and virtio_device_ready(): if
> > > ndo_open() is called before virtio_device_ready(), the driver may
> > > start to use the device before DRIVER_OK which violates the spec.
> > >
> > > Fixing this by switching to use register_netdevice() and protect the
> > > virtio_device_ready() with rtnl_lock() to make sure ndo_open() can
> > > only be called after virtio_device_ready().
> > >
> > > Fixes: 4baf1e33d0842 ("virtio_net: enable VQs early")
> > > Signed-off-by: Jason Wang <jasowang@xxxxxxxxxx>
> > > ---
> > >  drivers/net/virtio_net.c | 8 +++++++-
> > >  1 file changed, 7 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > index db05b5e930be..8a5810bcb839 100644
> > > --- a/drivers/net/virtio_net.c
> > > +++ b/drivers/net/virtio_net.c
> > > @@ -3655,14 +3655,20 @@ static int virtnet_probe(struct virtio_device *vdev)
> > >  	if (vi->has_rss || vi->has_rss_hash_report)
> > >  		virtnet_init_default_rss(vi);
> > >
> > > -	err = register_netdev(dev);
> > > +	/* serialize netdev register + virtio_device_ready() with ndo_open() */
> > > +	rtnl_lock();
> > > +
> > > +	err = register_netdevice(dev);
> > >  	if (err) {
> > >  		pr_debug("virtio_net: registering device failed\n");
> > > +		rtnl_unlock();
> > >  		goto free_failover;
> > >  	}
> > >
> > >  	virtio_device_ready(vdev);
> > >
> > > +	rtnl_unlock();
> > > +
> > >  	err = virtnet_cpu_notif_add(vi);
> > >  	if (err) {
> > >  		pr_debug("virtio_net: registering cpu notifier failed\n");
> >
> >
> > Looks good but then don't we have the same issue when removing the
> > device?
> >
> > Actually I looked at virtnet_remove and I see
> >         unregister_netdev(vi->dev);
> >
> >         net_failover_destroy(vi->failover);
> >
> >         remove_vq_common(vi); <- this will reset the device
> >
> > a window here?
>
> Probably. For safety, we probably need to reset before unregistering.

careful not to create new races, let's analyse this one to be sure first.

> >
> >
> > Really, I think what we had originally was a better idea -
> > instead of dropping interrupts they were delayed and
> > when driver is ready to accept them it just enables them.
>
> The problem is that it works only on some specific setup:
>
> - doesn't work on shared IRQ
> - doesn't work on some specific driver e.g virtio-blk

can some core irq work fix that?

> > We just need to make sure driver does not wait for
> > interrupts before enabling them.
> >
> > And I suspect we need to make this opt-in on a per driver
> > basis.
>
> Exactly.
>
> Thanks
>
> >
> >
> >
> > > --
> > > 2.25.1
> >

_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
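
For anyone following the probe-side ordering argument in the quoted patch, here is a minimal userspace sketch of it. It is only a model under stated assumptions, not kernel code: struct model_dev, open_violates(), probe_with_open_at() and count_violations() are hypothetical names invented for illustration, and the lock is modelled simply by deferring an open attempt that arrives inside the critical section until after the unlock, which is roughly what a waiter on rtnl_lock() would experience.

```c
#include <stdbool.h>

/* Hypothetical model state; these names are illustrative, not kernel APIs. */
struct model_dev {
	bool registered;	/* netdev is visible, so an open may be attempted */
	bool driver_ok;		/* stands in for virtio_device_ready() setting DRIVER_OK */
};

/*
 * Model one ndo_open() attempt. Returns true if the open would use the
 * device before DRIVER_OK, i.e. the spec violation under discussion.
 */
static bool open_violates(const struct model_dev *d)
{
	return d->registered && !d->driver_ok;
}

/*
 * Replay probe with a single open attempt arriving at one of three slots:
 *   0 = before registration, 1 = between registration and ready, 2 = after ready.
 * With hold_lock set, an open arriving at slot 1 is deferred until the
 * modelled unlock, because it would have to wait for the lock.
 */
static bool probe_with_open_at(int slot, bool hold_lock)
{
	struct model_dev d = { false, false };
	bool violated = false;

	if (slot == 0)
		violated |= open_violates(&d);

	/* modelled rtnl_lock() would be taken here when hold_lock is set */
	d.registered = true;		/* models register_netdevice() */
	if (slot == 1 && !hold_lock)
		violated |= open_violates(&d);
	d.driver_ok = true;		/* models virtio_device_ready() */
	/* modelled rtnl_unlock() */

	if (slot == 1 && hold_lock)
		violated |= open_violates(&d);	/* deferred open runs now */
	if (slot == 2)
		violated |= open_violates(&d);

	return violated;
}

/* Count how many of the three arrival slots lead to a violation. */
int count_violations(bool hold_lock)
{
	int n = 0;

	for (int slot = 0; slot < 3; slot++)
		if (probe_with_open_at(slot, hold_lock))
			n++;
	return n;
}
```

Calling count_violations(false) models the old ordering and count_violations(true) the patched one. The model says nothing about the remove path or about interrupts, which are the open questions in this thread.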