Re: [REGRESSION] usb: acpi: add device link between tunneled USB3 device and USB4 Host Interface

On 10/3/2024 13:51, Harry Wentland wrote:


On 2024-10-03 14:23, Harry Wentland wrote:


On 2024-10-03 09:47, Mika Westerberg wrote:
On Thu, Oct 03, 2024 at 08:42:21AM -0500, Mario Limonciello wrote:
On 10/3/2024 08:27, Mika Westerberg wrote:
On Thu, Oct 03, 2024 at 08:10:11AM -0500, Mario Limonciello wrote:
On 10/3/2024 00:47, Mika Westerberg wrote:
Hi Harry,

On Wed, Oct 02, 2024 at 01:42:29PM -0400, Harry Wentland wrote:
I was checking out the 6.12 rc1 (through drm-next) kernel and found
my system hung at boot. No meaningful message showed on the kernel
boot screen.

A bisect revealed the culprit to be

commit f1bfb4a6fed64de1771b43a76631942279851744 (HEAD)
Author: Mathias Nyman <mathias.nyman@xxxxxxxxxxxxxxx>
Date:   Fri Aug 30 18:26:29 2024 +0300

       usb: acpi: add device link between tunneled USB3 device and USB4 Host Interface

A revert of this single patch "fixes" the issue and I can boot again.
The system in question is a Thinkpad T14 with a Ryzen 7 PRO 6850U CPU.
It's running Arch Linux but I doubt that's of consequence.

lspci output:
       https://gist.github.com/hwentland/59aef63d9b742b7b64d2604aae9792e0
acpidump:
       https://gist.github.com/hwentland/4824afc8d712c3d600be5c291f7f1089

Mario suggested I try modprobe.blacklist=xhci-hcd but that did nothing.
Another suggestion to do usbcore.nousb lets me boot to the desktop
on a kernel with the faulty patch, without USB functionality, obviously.

I'd be happy to try any patches, provide more data, or run experiments.

Do you boot with any device connected?

Great question. A Thinkpad USB-C dock. When I unplug the dock at boot it
boots fine and when I plug it in later the laptop charges from it and the
dock's audio output works fine.

In the midst of my experiments I also noticed at one point the dock
wasn't charging my laptop and hard-resetting the laptop didn't fix that.
I had to unplug the dock from the wall and plug it back. So there is
likely some interaction going on with this particular dock that must've
sent the dock's FW into a bad state.

The dmesg with the revert and thunderbolt.dyndbg=+p is here
https://gist.github.com/hwentland/7e25dedd3e707fdae1185d65224d4d66


Apologies, that dmesg was from a build with a bad .config and has some
FW loading errors. They seem to be unrelated though. This is a dmesg
from a good build. It still has a wlan FW error but that shouldn't have
anything to do with the problem at hand.

https://gist.github.com/hwentland/867f7afbf3df20547a877e794a8d8e6b

I don't see any PCIe tunneling option in my BIOS.

Second thing that I noticed, though I'm not familiar with AMD hardware,
but from your lspci dump, I do not see the PCIe ports that are being
used to tunnel PCIe. Does this system have PCIe tunneling disabled
somehow?

On some OEM systems it's possible to lock down from BIOS to turn off PCIe
tunneling, and I agree that looks like the most common cause.

This is what you would see on a system that has tunnels (I checked on my
side w/ Z series laptop w/ Rembrandt and a dock connected):

             +-03.0
             +-03.1-[03-32]--
             +-04.0
             +-04.1-[33-62]----00.0-[34-62]--+-02.0-[35]----00.0
             |                               \-04.0-[36-62]--

00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h-19h PCIe Dummy Host Bridge [1022:14b7] (rev 01)
00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 19h USB4/Thunderbolt PCIe tunnel [1022:14cd]
00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h-19h PCIe Dummy Host Bridge [1022:14b7] (rev 01)
00:04.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 19h USB4/Thunderbolt PCIe tunnel [1022:14cd]

Okay this is more like what I expected, although probably not the
reason here.

Are you able to replicate the issue if you disable PCIe tunneling from
the BIOS on your reference system? (Probably not but just in case).

I checked on the Lenovo Z13 laptop I have and turned off "USB port" in BIOS
setup and this caused the endpoints 3.1 and 4.1 I listed above to disappear
but the system still boots up just fine for me on 6.12-rc1.

Okay thanks for checking!

You don't see anything on the console? It's all blank or it just hangs
after some messages?


It hangs after some messages.

I guess it is getting stuck on fwnode_find_reference() because it never
finds the given node?

Looking at the code, I don't see where it could get stuck. If for some
reason there is no such reference (there is based on the ACPI dump) then
it should not affect the boot. It only matters when power management is
involved.
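
For readers unfamiliar with the convention behind the IS_ERR() check on the result of fwnode_find_reference(), here is a minimal userspace sketch of how an error pointer lets a missing reference be skipped quietly. The helpers below only mirror the kernel's ERR_PTR()/PTR_ERR()/IS_ERR() idea; find_reference(), try_add_link() and fake_nhi are hypothetical names for illustration, not kernel APIs.

```c
/* Userspace sketch of the kernel's error-pointer convention (not kernel code). */
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top MAX_ERRNO addresses. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static int fake_nhi; /* hypothetical stand-in for the NHI fwnode */

/* Hypothetical lookup: returns the node, or ERR_PTR(-ENOENT) if absent. */
static void *find_reference(int present)
{
	return present ? (void *)&fake_nhi : ERR_PTR(-ENOENT);
}

/* Returns 1 if a link would be created, 0 if the reference is missing. */
int try_add_link(int present)
{
	void *nhi = find_reference(present);

	if (IS_ERR(nhi))
		return 0;	/* no reference: skip quietly, not a failure */
	return 1;
}

/* Small self-check exercising both paths. */
void err_ptr_demo(void)
{
	assert(try_add_link(0) == 0);
	assert(try_add_link(1) == 1);
	assert(PTR_ERR(ERR_PTR(-ENOENT)) == -ENOENT);
}
```

Under this convention a failed lookup neither dereferences a NULL pointer nor blocks, which fits the observation that a missing reference alone should not hang the boot.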

Nothing jumps out to me either.  Maybe this is a situation where Harry can
sprinkle a bunch of printk's all over usb_acpi_add_usb4_devlink() to
shed light on what's going on (assuming the console output is "working" when
this happens).


I sprinkled printks but don't see any on the console.


You said it can work properly without the revert if you don't boot with the dock plugged in?

How about if you unplug it, does it unhang and do you get everything flushed to the console?

Or maybe magic sysrq with a backtrace (l) can help see where something is spinning.

Harry

There are a couple of places there that may cause it to crash, I think.
And the __free() magic is something I cannot wrap my head around :(

Anyways, Harry can you try the below patch and see if it makes any
difference? Also if it does please provide dmesg.


The patch doesn't seem to make a difference. Same hang on boot.

Harry

diff --git a/drivers/usb/core/usb-acpi.c b/drivers/usb/core/usb-acpi.c
index 21585ed89ef8..90360f7ca905 100644
--- a/drivers/usb/core/usb-acpi.c
+++ b/drivers/usb/core/usb-acpi.c
@@ -157,6 +157,7 @@ EXPORT_SYMBOL_GPL(usb_acpi_set_power_state);
  */
 static int usb_acpi_add_usb4_devlink(struct usb_device *udev)
 {
+	struct fwnode_handle *nhi_fwnode;
 	const struct device_link *link;
 	struct usb_port *port_dev;
 	struct usb_hub *hub;
@@ -165,11 +166,12 @@ static int usb_acpi_add_usb4_devlink(struct usb_device *udev)
 		return 0;
 
 	hub = usb_hub_to_struct_hub(udev->parent);
-	port_dev = hub->ports[udev->portnum - 1];
+	if (WARN_ON(!hub))
+		return 0;
 
-	struct fwnode_handle *nhi_fwnode __free(fwnode_handle) =
-		fwnode_find_reference(dev_fwnode(&port_dev->dev), "usb4-host-interface", 0);
+	port_dev = hub->ports[udev->portnum - 1];
+
+	nhi_fwnode = fwnode_find_reference(dev_fwnode(&port_dev->dev), "usb4-host-interface", 0);
 	if (IS_ERR(nhi_fwnode))
 		return 0;
 
@@ -180,12 +182,14 @@ static int usb_acpi_add_usb4_devlink(struct usb_device *udev)
 	if (!link) {
 		dev_err(&port_dev->dev, "Failed to created device link from %s to %s\n",
 			dev_name(&port_dev->child->dev), dev_name(nhi_fwnode->dev));
+		fwnode_handle_put(nhi_fwnode);
 		return -EINVAL;
 	}
 
-	dev_dbg(&port_dev->dev, "Created device link from %s to %s\n",
-		dev_name(&port_dev->child->dev), dev_name(nhi_fwnode->dev));
+	dev_info(&port_dev->dev, "Created device link from %s to %s\n",
+		 dev_name(&port_dev->child->dev), dev_name(nhi_fwnode->dev));
 
+	fwnode_handle_put(nhi_fwnode);
 	return 0;
 }
 
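
The __free(fwnode_handle) annotation that the patch above replaces with explicit fwnode_handle_put() calls builds on the compiler's cleanup variable attribute (a GCC/Clang extension). The following is a rough userspace sketch of that mechanism, not the kernel implementation: struct node, node_get(), node_put() and the auto_put macro are hypothetical stand-ins for the fwnode handle, its get/put helpers and __free(). It contrasts the automatic form with the manual form the patch switches to.

```c
/* Userspace sketch of scope-based cleanup (GCC/Clang extension), not kernel code. */
#include <assert.h>

/* Hypothetical stand-in for a reference-counted fwnode_handle. */
struct node {
	int refcount;
};

static struct node *node_get(struct node *n)
{
	n->refcount++;
	return n;
}

static void node_put(struct node *n)
{
	if (n)
		n->refcount--;
}

/* The cleanup attribute calls this with a pointer to the variable. */
static void node_put_cleanup(struct node **np)
{
	node_put(*np);
}

/* Hypothetical macro playing the role of __free(fwnode_handle). */
#define auto_put __attribute__((cleanup(node_put_cleanup)))

/* Automatic variant: every return path drops the reference. */
int use_node_auto(struct node *shared, int fail)
{
	struct node *ref auto_put = node_get(shared);

	if (fail)
		return -1;	/* node_put_cleanup(&ref) runs here */
	return 0;		/* ... and here */
}

/* Manual variant: each exit needs its own put. */
int use_node_manual(struct node *shared, int fail)
{
	struct node *ref = node_get(shared);

	if (fail) {
		node_put(ref);
		return -1;
	}
	node_put(ref);
	return 0;
}

/* Small self-check: both variants leave the count where it started. */
void cleanup_demo(void)
{
	struct node n = { .refcount = 1 };

	use_node_auto(&n, 1);
	use_node_auto(&n, 0);
	use_node_manual(&n, 1);
	use_node_manual(&n, 0);
	assert(n.refcount == 1);	/* all references balanced */
}
```

Both variants keep the reference count balanced on every exit path. The manual form is more verbose but makes each put visible, which can help when ruling out the cleanup logic as the cause of a hang.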






