Hello Vladimir! On Wed, 2024-09-11 at 19:28 +0300, Vladimir Oltean wrote: > On Wed, Sep 11, 2024 at 04:40:03PM +0200, A. Sverdlin wrote: > > From: Alexander Sverdlin <alexander.sverdlin@xxxxxxxxxxx> > > > > dsa_switch_shutdown() doesn't bring down any ports, but only disconnects > > slaves from master. Packets still come afterwards into master port and the > > ports are being polled for link status. This leads to crashes: > > > > Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 > > CPU: 0 PID: 442 Comm: kworker/0:3 Tainted: G O 6.1.99+ #1 > > Workqueue: events_power_efficient phy_state_machine > > pc : lan9303_mdio_phy_read > > lr : lan9303_phy_read > > Call trace: > > lan9303_mdio_phy_read > > lan9303_phy_read > > dsa_slave_phy_read > > __mdiobus_read > > mdiobus_read > > genphy_update_link > > genphy_read_status > > phy_check_link_status > > phy_state_machine > > process_one_work > > worker_thread > > > > Call lan9303_remove() instead to really unregister all ports before zeroing > > drvdata and dsa_ptr. > > > > Fixes: 0650bf52b31f ("net: dsa: be compatible with masters which unregister on shutdown") > > Cc: stable@xxxxxxxxxxxxxxx > > Signed-off-by: Alexander Sverdlin <alexander.sverdlin@xxxxxxxxxxx> > > --- > > drivers/net/dsa/lan9303-core.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c > > index 268949939636..ecd507355f51 100644 > > --- a/drivers/net/dsa/lan9303-core.c > > +++ b/drivers/net/dsa/lan9303-core.c > > @@ -1477,7 +1477,7 @@ EXPORT_SYMBOL(lan9303_remove); > > > > void lan9303_shutdown(struct lan9303 *chip) > > { > > - dsa_switch_shutdown(chip->ds); > > + lan9303_remove(chip); > > } > > EXPORT_SYMBOL(lan9303_shutdown); > > > > -- > > 2.46.0 > > > > You've said here that a similar change still does not protect against > packets received after shutdown: > https://lore.kernel.org/netdev/c5e0e67400816d68e6bf90b4a999bfa28c59043b.camel@xxxxxxxxxxx/ > > The difference between that and this is the extra lan9303_disable_processing_port() > calls here. But while that does disable RX on switch ports, it still doesn't wait > for pending RX frames to be processed. So the race is still open. No? This patch addresses the race of zeroing drvdata in static void lan9303_mdio_shutdown(struct mdio_device *mdiodev) { struct lan9303_mdio *sw_dev = dev_get_drvdata(&mdiodev->dev); if (!sw_dev) return; lan9303_shutdown(&sw_dev->chip); dev_set_drvdata(&mdiodev->dev, NULL); } versus static int lan9303_mdio_phy_read(struct lan9303 *chip, int phy, int reg) { struct lan9303_mdio *sw_dev = dev_get_drvdata(chip->dev); what you refer to is another race, zeroing of dsa_ptr in struct net_device versus the whole network stack, which I addressed in https://lore.kernel.org/netdev/20240910130321.337154-2-alexander.sverdlin@xxxxxxxxxxx/ -- Alexander Sverdlin Siemens AG www.siemens.com