On Fri, 2019-04-19 at 09:17 +0200, Jan Klötzke wrote:
> Hi David,
>
> On Thu, Apr 18, 2019 at 04:35:44PM -0700, David Miller wrote:
> > From: Kloetzke Jan <Jan.Kloetzke@xxxxxxx>
> > Date: Thu, 18 Apr 2019 08:02:59 +0000
> >
> > > I think this assumption is not correct. As far as I understand the
> > > networking code it is still possible that the ndo_start_xmit callback
> > > is called while ndo_stop is running and even after ndo_stop has
> > > returned. You can only be sure after unregister_netdev() has returned.
> > > Maybe some networking folks can comment on that.
> >
> > The kernel loops over the devices being unregistered, and first it clears
> > the __LINK_STATE_START on all of them, then it invokes ->ndo_stop() on
> > all of them.
> >
> > __LINK_STATE_START controls what netif_running() returns.
> >
> > All calls to ->ndo_start_xmit() are guarded by netif_running() checks.
> >
> > So when ndo_stop is invoked you should get no more ndo_start_xmit
> > invocations on that device. Otherwise how could you shut down DMA
> > resources and turn off the TX engine properly?
>
> But you could still race with another CPU that is past the
> netif_running() check, can you? So the driver has to make sure that it
> gracefully handles concurrent ->ndo_start_xmit() and ->ndo_stop() calls.

Looking at dev_direct_xmit(struct sk_buff *skb, u16 queue_id),
this indeed seems possible. But the documentation says that it is not.

Dave?

> Or are there any locks/barriers involved that make sure all
> ->ndo_start_xmit() calls have returned before invoking ->ndo_stop()?

Jan, could you make a version of your patch that gives a WARNing if this
race triggers?

	Regards
		Oliver