On Thu, Dec 10, 2020 at 03:35:29PM +0100, Marcin Wojtas wrote: > Hi Greg, > > śr., 9 gru 2020 o 11:59 Greg Kroah-Hartman > <gregkh@xxxxxxxxxxxxxxxxxxx> napisał(a): > > What part fixes the issue? I can't see it... > > I re-checked in my setup and here's the smallest part of the original > patch, that fixes previously described issue: > diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c > b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c > index e98be8372780..9d71a4fe1750 100644 > --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c > +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c > @@ -4767,6 +4767,11 @@ static void mvpp2_port_copy_mac_addr(struct > net_device *dev, struct mvpp2 *priv, > eth_hw_addr_random(dev); > } > > +static struct mvpp2_port *mvpp2_phylink_to_port(struct phylink_config *config) > +{ > + return container_of(config, struct mvpp2_port, phylink_config); > +} > + > static void mvpp2_phylink_validate(struct phylink_config *config, > unsigned long *supported, > struct phylink_link_state *state) > @@ -5105,13 +5110,12 @@ static void mvpp2_gmac_config(struct > mvpp2_port *port, unsigned int mode, > static void mvpp2_mac_config(struct phylink_config *config, unsigned int mode, > const struct phylink_link_state *state) > { > - struct net_device *dev = to_net_dev(config->dev); > - struct mvpp2_port *port = netdev_priv(dev); > + struct mvpp2_port *port = mvpp2_phylink_to_port(config); > bool change_interface = port->phy_interface != state->interface; > > /* Check for invalid configuration */ > if (mvpp2_is_xlg(state->interface) && port->gop_id != 0) { > - netdev_err(dev, "Invalid mode on %s\n", dev->name); > + netdev_err(port->dev, "Invalid mode on %s\n", port->dev->name); > return; > } > > @@ -5151,8 +5155,7 @@ static void mvpp2_mac_link_up(struct > phylink_config *config, > int speed, int duplex, > bool tx_pause, bool rx_pause) > { > - struct net_device *dev = to_net_dev(config->dev); > - struct mvpp2_port *port = netdev_priv(dev); > + struct mvpp2_port *port = mvpp2_phylink_to_port(config); > u32 val; > > if (mvpp2_is_xlg(interface)) { > @@ -5199,7 +5202,7 @@ static void mvpp2_mac_link_up(struct > phylink_config *config, > > mvpp2_egress_enable(port); > mvpp2_ingress_enable(port); > - netif_tx_wake_all_queues(dev); > + netif_tx_wake_all_queues(port->dev); > } > > static void mvpp2_mac_link_down(struct phylink_config *config, The problem is caused by this hack: /* Phylink isn't used as of now for ACPI, so the MAC has to be * configured manually when the interface is started. This will * be removed as soon as the phylink ACPI support lands in. */ struct phylink_link_state state = { .interface = port->phy_interface, }; mvpp2_mac_config(&port->phylink_config, MLO_AN_INBAND, &state); mvpp2_mac_link_up(&port->phylink_config, MLO_AN_INBAND, port->phy_interface, NULL); which passes an un-initialised (zeroed) port->phylink_config, as phylink is not used in ACPI setups. The problem occurs because port->phylink_config.dev (which is a NULL pointer in this instance) is passed to to_net_dev(): #define to_net_dev(d) container_of(d, struct net_device, dev) Which then means netdev_priv(dev) attempts to dereference a not-quite NULL pointer, leading to an oops. The problem here is that the bug was not noticed; it seems hardly anyone bothers to run mainline kernels with ACPI on Marvell platforms, or if they do, they don't bother reporting to mainline communities when they have problems. Likely, there's posts on some random web-based bulletin board or mailing list that kernel developers don't read somewhere complaining that there's an oops. Like... https://lists.einval.com/pipermail/macchiato/2020-January/000309.html https://gist.github.com/AdrianKoshka/ff9862da2183a2d8e26d47baf8dc04b9 This kind of segmentation is very disappointing; it means potentially lots of bugs go by unnoticed by kernel developers, and bugs only get fixed by chance. Had it been reported to somewhere known earlier this year, it is likely that a proper fix patch would have been created. How this gets handled is ultimately up to the stable developers to decide what they wish to accept. Do they wish to accept a back-ported full version of my commit 6c2b49eb9671 ("net: mvpp2: add mvpp2_phylink_to_port() helper") that unintentionally fixed this unknown issue, or do they want a more minimal fix such as the cut-down version of that commit that Marcin has supplied. Until something changes in the way bugs get reported, I suspect this won't be the only instance of bug-fixing-by-accident happening. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!