Re: stable-rc/linux-5.18.y bisection: baseline.login on panda

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Aug 24, 2022 at 01:22:12PM -0700, KernelCI bot wrote:

The KernelCI bisection bot identified 7eea9a60703ca ("usbnet: smsc95xx:
Forward PHY interrupts to PHY driver to avoid polling") as causing a
boot regression on Panda in v5.18.  The board stops detecting a link
which breks NFS boot:

<6>[    4.953613] smsc95xx 2-1.1:1.0 eth0: register 'smsc95xx' at usb-4a064c00.ehci-1.1, smsc95xx USB 2.0 Ethernet, 02:03:01:8c:13:b0
<6>[    5.032928] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
<6>[    5.044036] smsc95xx 2-1.1:1.0 eth0: Link is Down
<6>[   25.053924] Waiting up to 100 more seconds for network.
<6>[   45.074005] Waiting up to 80 more seconds for network.
<6>[   65.093933] Waiting up to 60 more seconds for network.
<6>[   85.123992] Waiting up to 40 more seconds for network.
<6>[  105.143951] Waiting up to 20 more seconds for network.
<5>[  125.084014] Sending DHCP requests ...... timed out!
<6>[  207.765808] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
<3>[  207.773468] IP-Config: Reopening network devices...
<6>[  207.840332] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
<6>[  207.851226] smsc95xx 2-1.1:1.0 eth0: Link is Down
<6>[  227.873931] Waiting up to 100 more seconds for network.
<6>[  247.873931] Waiting up to 80 more seconds for network.

We're seen other failures on this board in mainline which stop boot
sooner so can't confirm if there's a similar issue there.

I've left the whole report, including links to more details like full
logs and a tag for the bot below:

> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> * This automated bisection report was sent to you on the basis  *
> * that you may be involved with the breaking commit it has      *
> * found.  No manual investigation has been done to verify it,   *
> * and the root cause of the problem may be somewhere else.      *
> *                                                               *
> * If you do send a fix, please include this trailer:            *
> *   Reported-by: "kernelci.org bot" <bot@xxxxxxxxxxxx>          *
> *                                                               *
> * Hope this helps!                                              *
> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> 
> stable-rc/linux-5.18.y bisection: baseline.login on panda
> 
> Summary:
>   Start:      22a992953741a Linux 5.18.19
>   Plain log:  https://storage.kernelci.org/stable-rc/linux-5.18.y/v5.18.19/arm/multi_v7_defconfig/gcc-10/lab-baylibre/baseline-panda.txt
>   HTML log:   https://storage.kernelci.org/stable-rc/linux-5.18.y/v5.18.19/arm/multi_v7_defconfig/gcc-10/lab-baylibre/baseline-panda.html
>   Result:     7eea9a60703ca usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
> 
> Checks:
>   revert:     PASS
>   verify:     PASS
> 
> Parameters:
>   Tree:       stable-rc
>   URL:        https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
>   Branch:     linux-5.18.y
>   Target:     panda
>   CPU arch:   arm
>   Lab:        lab-baylibre
>   Compiler:   gcc-10
>   Config:     multi_v7_defconfig
>   Test case:  baseline.login
> 
> Breaking commit found:
> 
> -------------------------------------------------------------------------------
> commit 7eea9a60703caf1b04eccba93cd103f1c8fc6809
> Author: Lukas Wunner <lukas@xxxxxxxxx>
> Date:   Thu May 12 10:42:05 2022 +0200
> 
>     usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
>     
>     [ Upstream commit 1ce8b37241ed291af56f7a49bbdbf20c08728e88 ]
>     
>     Link status of SMSC LAN95xx chips is polled once per second, even though
>     they're capable of signaling PHY interrupts through the MAC layer.
>     
>     Forward those interrupts to the PHY driver to avoid polling.  Benefits
>     are reduced bus traffic, reduced CPU overhead and quicker interface
>     bringup.
>     
>     Polling was introduced in 2016 by commit d69d16949346 ("usbnet:
>     smsc95xx: fix link detection for disabled autonegotiation").
>     Back then, the LAN95xx driver neglected to enable the ENERGYON interrupt,
>     hence couldn't detect link-up events when auto-negotiation was disabled.
>     The proper solution would have been to enable the ENERGYON interrupt
>     instead of polling.
>     
>     Since then, PHY handling was moved from the LAN95xx driver to the SMSC
>     PHY driver with commit 05b35e7eb9a1 ("smsc95xx: add phylib support").
>     That PHY driver is capable of link detection with auto-negotiation
>     disabled because it enables the ENERGYON interrupt.
>     
>     Note that signaling interrupts through the MAC layer not only works with
>     the integrated PHY, but also with an external PHY, provided its
>     interrupt pin is attached to LAN95xx's nPHY_INT pin.
>     
>     In the unlikely event that the interrupt pin of an external PHY is
>     attached to a GPIO of the SoC (or not connected at all), the driver can
>     be amended to retrieve the irq from the PHY's of_node.
>     
>     To forward PHY interrupts to phylib, it is not sufficient to call
>     phy_mac_interrupt().  Instead, the PHY's interrupt handler needs to run
>     so that PHY interrupts are cleared.  That's because according to page
>     119 of the LAN950x datasheet, "The source of this interrupt is a level.
>     The interrupt persists until it is cleared in the PHY."
>     
>     https://www.microchip.com/content/dam/mchp/documents/UNG/ProductDocuments/DataSheets/LAN950x-Data-Sheet-DS00001875D.pdf
>     
>     Therefore, create an IRQ domain with a single IRQ for the PHY.  In the
>     future, the IRQ domain may be extended to support the 11 GPIOs on the
>     LAN95xx.
>     
>     Normally the PHY interrupt should be masked until the PHY driver has
>     cleared it.  However masking requires a (sleeping) USB transaction and
>     interrupts are received in (non-sleepable) softirq context.  I decided
>     not to mask the interrupt at all (by using the dummy_irq_chip's noop
>     ->irq_mask() callback):  The USB interrupt endpoint is polled in 1 msec
>     intervals and normally that's sufficient to wake the PHY driver's IRQ
>     thread and have it clear the interrupt.  If it does take longer, worst
>     thing that can happen is the IRQ thread is woken again.  No big deal.
>     
>     Because PHY interrupts are now perpetually enabled, there's no need to
>     selectively enable them on suspend.  So remove all invocations of
>     smsc95xx_enable_phy_wakeup_interrupts().
>     
>     In smsc95xx_resume(), move the call of phy_init_hw() before
>     usbnet_resume() (which restarts the status URB) to ensure that the PHY
>     is fully initialized when an interrupt is handled.
>     
>     Tested-by: Oleksij Rempel <o.rempel@xxxxxxxxxxxxxx> # LAN9514/9512/9500
>     Tested-by: Ferry Toth <fntoth@xxxxxxxxx> # LAN9514
>     Signed-off-by: Lukas Wunner <lukas@xxxxxxxxx>
>     Reviewed-by: Andrew Lunn <andrew@xxxxxxx> # from a PHY perspective
>     Cc: Andre Edich <andre.edich@xxxxxxxxxxxxx>
>     Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
>     Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
> 
> diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
> index f5a208948d22a..358b170cc8fb7 100644
> --- a/drivers/net/usb/smsc95xx.c
> +++ b/drivers/net/usb/smsc95xx.c
> @@ -18,6 +18,8 @@
>  #include <linux/usb/usbnet.h>
>  #include <linux/slab.h>
>  #include <linux/of_net.h>
> +#include <linux/irq.h>
> +#include <linux/irqdomain.h>
>  #include <linux/mdio.h>
>  #include <linux/phy.h>
>  #include <net/selftests.h>
> @@ -53,6 +55,9 @@
>  #define SUSPEND_ALLMODES		(SUSPEND_SUSPEND0 | SUSPEND_SUSPEND1 | \
>  					 SUSPEND_SUSPEND2 | SUSPEND_SUSPEND3)
>  
> +#define SMSC95XX_NR_IRQS		(1) /* raise to 12 for GPIOs */
> +#define PHY_HWIRQ			(SMSC95XX_NR_IRQS - 1)
> +
>  struct smsc95xx_priv {
>  	u32 mac_cr;
>  	u32 hash_hi;
> @@ -61,6 +66,9 @@ struct smsc95xx_priv {
>  	spinlock_t mac_cr_lock;
>  	u8 features;
>  	u8 suspend_flags;
> +	struct irq_chip irqchip;
> +	struct irq_domain *irqdomain;
> +	struct fwnode_handle *irqfwnode;
>  	struct mii_bus *mdiobus;
>  	struct phy_device *phydev;
>  };
> @@ -597,6 +605,8 @@ static void smsc95xx_mac_update_fullduplex(struct usbnet *dev)
>  
>  static void smsc95xx_status(struct usbnet *dev, struct urb *urb)
>  {
> +	struct smsc95xx_priv *pdata = dev->driver_priv;
> +	unsigned long flags;
>  	u32 intdata;
>  
>  	if (urb->actual_length != 4) {
> @@ -608,11 +618,15 @@ static void smsc95xx_status(struct usbnet *dev, struct urb *urb)
>  	intdata = get_unaligned_le32(urb->transfer_buffer);
>  	netif_dbg(dev, link, dev->net, "intdata: 0x%08X\n", intdata);
>  
> +	local_irq_save(flags);
> +
>  	if (intdata & INT_ENP_PHY_INT_)
> -		;
> +		generic_handle_domain_irq(pdata->irqdomain, PHY_HWIRQ);
>  	else
>  		netdev_warn(dev->net, "unexpected interrupt, intdata=0x%08X\n",
>  			    intdata);
> +
> +	local_irq_restore(flags);
>  }
>  
>  /* Enable or disable Tx & Rx checksum offload engines */
> @@ -1098,8 +1112,9 @@ static int smsc95xx_bind(struct usbnet *dev, struct usb_interface *intf)
>  {
>  	struct smsc95xx_priv *pdata;
>  	bool is_internal_phy;
> +	char usb_path[64];
> +	int ret, phy_irq;
>  	u32 val;
> -	int ret;
>  
>  	printk(KERN_INFO SMSC_CHIPNAME " v" SMSC_DRIVER_VERSION "\n");
>  
> @@ -1139,10 +1154,38 @@ static int smsc95xx_bind(struct usbnet *dev, struct usb_interface *intf)
>  	if (ret)
>  		goto free_pdata;
>  
> +	/* create irq domain for use by PHY driver and GPIO consumers */
> +	usb_make_path(dev->udev, usb_path, sizeof(usb_path));
> +	pdata->irqfwnode = irq_domain_alloc_named_fwnode(usb_path);
> +	if (!pdata->irqfwnode) {
> +		ret = -ENOMEM;
> +		goto free_pdata;
> +	}
> +
> +	pdata->irqdomain = irq_domain_create_linear(pdata->irqfwnode,
> +						    SMSC95XX_NR_IRQS,
> +						    &irq_domain_simple_ops,
> +						    pdata);
> +	if (!pdata->irqdomain) {
> +		ret = -ENOMEM;
> +		goto free_irqfwnode;
> +	}
> +
> +	phy_irq = irq_create_mapping(pdata->irqdomain, PHY_HWIRQ);
> +	if (!phy_irq) {
> +		ret = -ENOENT;
> +		goto remove_irqdomain;
> +	}
> +
> +	pdata->irqchip = dummy_irq_chip;
> +	pdata->irqchip.name = SMSC_CHIPNAME;
> +	irq_set_chip_and_handler_name(phy_irq, &pdata->irqchip,
> +				      handle_simple_irq, "phy");
> +
>  	pdata->mdiobus = mdiobus_alloc();
>  	if (!pdata->mdiobus) {
>  		ret = -ENOMEM;
> -		goto free_pdata;
> +		goto dispose_irq;
>  	}
>  
>  	ret = smsc95xx_read_reg(dev, HW_CFG, &val);
> @@ -1175,6 +1218,7 @@ static int smsc95xx_bind(struct usbnet *dev, struct usb_interface *intf)
>  		goto unregister_mdio;
>  	}
>  
> +	pdata->phydev->irq = phy_irq;
>  	pdata->phydev->is_internal = is_internal_phy;
>  
>  	/* detect device revision as different features may be available */
> @@ -1217,6 +1261,15 @@ static int smsc95xx_bind(struct usbnet *dev, struct usb_interface *intf)
>  free_mdio:
>  	mdiobus_free(pdata->mdiobus);
>  
> +dispose_irq:
> +	irq_dispose_mapping(phy_irq);
> +
> +remove_irqdomain:
> +	irq_domain_remove(pdata->irqdomain);
> +
> +free_irqfwnode:
> +	irq_domain_free_fwnode(pdata->irqfwnode);
> +
>  free_pdata:
>  	kfree(pdata);
>  	return ret;
> @@ -1229,6 +1282,9 @@ static void smsc95xx_unbind(struct usbnet *dev, struct usb_interface *intf)
>  	phy_disconnect(dev->net->phydev);
>  	mdiobus_unregister(pdata->mdiobus);
>  	mdiobus_free(pdata->mdiobus);
> +	irq_dispose_mapping(irq_find_mapping(pdata->irqdomain, PHY_HWIRQ));
> +	irq_domain_remove(pdata->irqdomain);
> +	irq_domain_free_fwnode(pdata->irqfwnode);
>  	netif_dbg(dev, ifdown, dev->net, "free pdata\n");
>  	kfree(pdata);
>  }
> @@ -1253,29 +1309,6 @@ static u32 smsc_crc(const u8 *buffer, size_t len, int filter)
>  	return crc << ((filter % 2) * 16);
>  }
>  
> -static int smsc95xx_enable_phy_wakeup_interrupts(struct usbnet *dev, u16 mask)
> -{
> -	int ret;
> -
> -	netdev_dbg(dev->net, "enabling PHY wakeup interrupts\n");
> -
> -	/* read to clear */
> -	ret = smsc95xx_mdio_read_nopm(dev, PHY_INT_SRC);
> -	if (ret < 0)
> -		return ret;
> -
> -	/* enable interrupt source */
> -	ret = smsc95xx_mdio_read_nopm(dev, PHY_INT_MASK);
> -	if (ret < 0)
> -		return ret;
> -
> -	ret |= mask;
> -
> -	smsc95xx_mdio_write_nopm(dev, PHY_INT_MASK, ret);
> -
> -	return 0;
> -}
> -
>  static int smsc95xx_link_ok_nopm(struct usbnet *dev)
>  {
>  	int ret;
> @@ -1442,7 +1475,6 @@ static int smsc95xx_enter_suspend3(struct usbnet *dev)
>  static int smsc95xx_autosuspend(struct usbnet *dev, u32 link_up)
>  {
>  	struct smsc95xx_priv *pdata = dev->driver_priv;
> -	int ret;
>  
>  	if (!netif_running(dev->net)) {
>  		/* interface is ifconfig down so fully power down hw */
> @@ -1461,27 +1493,10 @@ static int smsc95xx_autosuspend(struct usbnet *dev, u32 link_up)
>  		}
>  
>  		netdev_dbg(dev->net, "autosuspend entering SUSPEND1\n");
> -
> -		/* enable PHY wakeup events for if cable is attached */
> -		ret = smsc95xx_enable_phy_wakeup_interrupts(dev,
> -			PHY_INT_MASK_ANEG_COMP_);
> -		if (ret < 0) {
> -			netdev_warn(dev->net, "error enabling PHY wakeup ints\n");
> -			return ret;
> -		}
> -
>  		netdev_info(dev->net, "entering SUSPEND1 mode\n");
>  		return smsc95xx_enter_suspend1(dev);
>  	}
>  
> -	/* enable PHY wakeup events so we remote wakeup if cable is pulled */
> -	ret = smsc95xx_enable_phy_wakeup_interrupts(dev,
> -		PHY_INT_MASK_LINK_DOWN_);
> -	if (ret < 0) {
> -		netdev_warn(dev->net, "error enabling PHY wakeup ints\n");
> -		return ret;
> -	}
> -
>  	netdev_dbg(dev->net, "autosuspend entering SUSPEND3\n");
>  	return smsc95xx_enter_suspend3(dev);
>  }
> @@ -1547,13 +1562,6 @@ static int smsc95xx_suspend(struct usb_interface *intf, pm_message_t message)
>  	}
>  
>  	if (pdata->wolopts & WAKE_PHY) {
> -		ret = smsc95xx_enable_phy_wakeup_interrupts(dev,
> -			(PHY_INT_MASK_ANEG_COMP_ | PHY_INT_MASK_LINK_DOWN_));
> -		if (ret < 0) {
> -			netdev_warn(dev->net, "error enabling PHY wakeup ints\n");
> -			goto done;
> -		}
> -
>  		/* if link is down then configure EDPD and enter SUSPEND1,
>  		 * otherwise enter SUSPEND0 below
>  		 */
> @@ -1787,11 +1795,12 @@ static int smsc95xx_resume(struct usb_interface *intf)
>  			return ret;
>  	}
>  
> +	phy_init_hw(pdata->phydev);
> +
>  	ret = usbnet_resume(intf);
>  	if (ret < 0)
>  		netdev_warn(dev->net, "usbnet_resume error\n");
>  
> -	phy_init_hw(pdata->phydev);
>  	return ret;
>  }
> -------------------------------------------------------------------------------
> 
> 
> Git bisection log:
> 
> -------------------------------------------------------------------------------
> git bisect start
> # good: [9aa5a042881d4a99657f82c774e9e15353ebeb2d] Linux 5.18.14
> git bisect good 9aa5a042881d4a99657f82c774e9e15353ebeb2d
> # bad: [22a992953741ad79c07890d3f4104585e52ef26b] Linux 5.18.19
> git bisect bad 22a992953741ad79c07890d3f4104585e52ef26b
> # good: [77d5174ed962c259965226386f71575790646ec0] drm/mediatek: dpi: Remove output format of YUV
> git bisect good 77d5174ed962c259965226386f71575790646ec0
> # good: [598bc9541e2f82a7fe8dfe23891201b355a56cda] usb: dwc3: core: Do not perform GCTL_CORE_SOFTRESET during bootup
> git bisect good 598bc9541e2f82a7fe8dfe23891201b355a56cda
> # good: [876f57cc94922896cc71dd4696013a7c0558c9b4] f2fs: give priority to select unpinned section for foreground GC
> git bisect good 876f57cc94922896cc71dd4696013a7c0558c9b4
> # bad: [5f4e505909fe50a4e256704076594ee3def0b9b1] block: add a bdev_max_zone_append_sectors helper
> git bisect bad 5f4e505909fe50a4e256704076594ee3def0b9b1
> # good: [b9c3401f7cac6ae291a16784dadcd1bf116218fe] x86/kprobes: Update kcb status flag after singlestepping
> git bisect good b9c3401f7cac6ae291a16784dadcd1bf116218fe
> # bad: [73ce2046e04ad488cecc66757c36cbe1bdf089d4] iommu/vt-d: avoid invalid memory access via node_online(NUMA_NO_NODE)
> git bisect bad 73ce2046e04ad488cecc66757c36cbe1bdf089d4
> # good: [f624c94ad56b663693a9413d8c8c03635435f369] drm/vc4: drv: Adopt the dma configuration from the HVS or V3D component
> git bisect good f624c94ad56b663693a9413d8c8c03635435f369
> # bad: [87c4896d5dd7fd9927c814cf3c6289f41de3b562] firmware: arm_scpi: Ensure scpi_info is not assigned if the probe fails
> git bisect bad 87c4896d5dd7fd9927c814cf3c6289f41de3b562
> # good: [9b61971cab92a918cab7168d439a351b1c799aca] usbnet: smsc95xx: Avoid link settings race on interrupt reception
> git bisect good 9b61971cab92a918cab7168d439a351b1c799aca
> # bad: [e81395cfbe62518f41af79a1b287fc8fe7c96b37] usbnet: smsc95xx: Fix deadlock on runtime resume
> git bisect bad e81395cfbe62518f41af79a1b287fc8fe7c96b37
> # bad: [7eea9a60703caf1b04eccba93cd103f1c8fc6809] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
> git bisect bad 7eea9a60703caf1b04eccba93cd103f1c8fc6809
> # first bad commit: [7eea9a60703caf1b04eccba93cd103f1c8fc6809] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
> -------------------------------------------------------------------------------
> 
> 
> -=-=-=-=-=-=-=-=-=-=-=-
> Groups.io Links: You receive all messages sent to this group.
> View/Reply Online (#30878): https://groups.io/g/kernelci-results/message/30878
> Mute This Topic: https://groups.io/mt/93235418/1131744
> Group Owner: kernelci-results+owner@xxxxxxxxx
> Unsubscribe: https://groups.io/g/kernelci-results/unsub [broonie@xxxxxxxxxx]
> -=-=-=-=-=-=-=-=-=-=-=-
> 
> 

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Arm (vger)]     [ARM Kernel]     [ARM MSM]     [Linux Tegra]     [Linux WPAN Networking]     [Linux Wireless Networking]     [Maemo Users]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux