Re: [REGRESSION] usb: acpi: add device link between tunneled USB3 device and USB4 Host Interface

On 2024-10-03 14:23, Harry Wentland wrote:
> 
> 
> On 2024-10-03 09:47, Mika Westerberg wrote:
>> On Thu, Oct 03, 2024 at 08:42:21AM -0500, Mario Limonciello wrote:
>>> On 10/3/2024 08:27, Mika Westerberg wrote:
>>>> On Thu, Oct 03, 2024 at 08:10:11AM -0500, Mario Limonciello wrote:
>>>>> On 10/3/2024 00:47, Mika Westerberg wrote:
>>>>>> Hi Harry,
>>>>>>
>>>>>> On Wed, Oct 02, 2024 at 01:42:29PM -0400, Harry Wentland wrote:
>>>>>>> I was checking out the 6.12 rc1 (through drm-next) kernel and found
>>>>>>> my system hung at boot. No meaningful message showed on the kernel
>>>>>>> boot screen.
>>>>>>>
>>>>>>> A bisect revealed the culprit to be
>>>>>>>
>>>>>>> commit f1bfb4a6fed64de1771b43a76631942279851744 (HEAD)
>>>>>>> Author: Mathias Nyman <mathias.nyman@xxxxxxxxxxxxxxx>
>>>>>>> Date:   Fri Aug 30 18:26:29 2024 +0300
>>>>>>>
>>>>>>>       usb: acpi: add device link between tunneled USB3 device and USB4 Host Interface
>>>>>>>
>>>>>>> A revert of this single patch "fixes" the issue and I can boot again.
>>>>>>> The system in question is a Thinkpad T14 with a Ryzen 7 PRO 6850U CPU.
>>>>>>> It's running Arch Linux but I doubt that's of consequence.
>>>>>>>
>>>>>>> lspci output:
>>>>>>>       https://gist.github.com/hwentland/59aef63d9b742b7b64d2604aae9792e0
>>>>>>> acpidump:
>>>>>>>       https://gist.github.com/hwentland/4824afc8d712c3d600be5c291f7f1089
>>>>>>>
>>>>>>> Mario suggested I try modprobe.blacklist=xhci-hcd but that did nothing.
>>>>>>> Another suggestion to do usbcore.nousb lets me boot to the desktop
>>>>>>> on a kernel with the faulty patch, without USB functionality, obviously.
>>>>>>>
>>>>>>> I'd be happy to try any patches, provide more data, or run experiments.
>>>>>>
>>>>>> Do you boot with any device connected?
> 
> Great question. A Thinkpad USB-C dock. When I unplug the dock at boot it
> boots fine and when I plug it in later the laptop charges from it and the
> dock's audio output works fine.
> 
> In the midst of my experiments I also noticed at one point the dock
> wasn't charging my laptop and hard-resetting the laptop didn't fix that.
> I had to unplug the dock from the wall and plug it back. So there is
> likely some interaction going on with this particular dock that must've
> sent the dock's FW into a bad state.
> 
> The dmesg with the revert and thunderbolt.dyndbg=+p is here
> https://gist.github.com/hwentland/7e25dedd3e707fdae1185d65224d4d66
> 

Apologies, that dmesg was from a build with a bad .config and has some
FW loading errors. They seem to be unrelated though. This is a dmesg
from a good build. It still has a wlan FW error but that shouldn't have
anything to do with the problem at hand.

https://gist.github.com/hwentland/867f7afbf3df20547a877e794a8d8e6b

> I don't see any PCIe tunneling option in my BIOS.
> 
>>>>>> Second thing that I noticed, though I'm not familiar with AMD hardware,
>>>>>> but from your lspci dump, I do not see the PCIe ports that are being
>>>>>> used to tunnel PCIe. Does this system have PCIe tunneling disabled
>>>>>> somehow?
>>>>>
>>>>> On some OEM systems it's possible to lock down from BIOS to turn off PCIe
>>>>> tunneling, and I agree that looks like the most common cause.
>>>>>
>>>>> This is what you would see on a system that has tunnels (I checked on my
>>>>> side w/ Z series laptop w/ Rembrandt and a dock connected):
>>>>>
>>>>>             +-03.0
>>>>>             +-03.1-[03-32]--
>>>>>             +-04.0
>>>>>             +-04.1-[33-62]----00.0-[34-62]--+-02.0-[35]----00.0
>>>>>             |                               \-04.0-[36-62]--
>>>>>
>>>>> 00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family
>>>>> 17h-19h PCIe Dummy Host Bridge [1022:14b7] (rev 01)
>>>>> 00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 19h
>>>>> USB4/Thunderbolt PCIe tunnel [1022:14cd]
>>>>> 00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family
>>>>> 17h-19h PCIe Dummy Host Bridge [1022:14b7] (rev 01)
>>>>> 00:04.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 19h
>>>>> USB4/Thunderbolt PCIe tunnel [1022:14cd]
>>>>
>>>> Okay this is more like what I expected, although probably not the
>>>> reason here.
>>>>
>>>> Are you able to replicate the issue if you disable PCIe tunneling from
>>>> the BIOS on your reference system? (Probably not but just in case).
>>>
>>> I checked on the Lenovo Z13 laptop I have and turned off "USB port" in BIOS
>>> setup and this caused the endpoints 3.1 and 4.1 I listed above to disappear
>>> but the system still boots up just fine for me on 6.12-rc1.
>>
>> Okay thanks for checking!
>>
>>>>>> You don't see anything on the console? It's all blank or it just hangs
>>>>>> after some messages?
>>>>>
> 
> It hangs after some messages.
> 
>>>>> I guess it is getting stuck on fwnode_find_reference() because it never
>>>>> finds the given node?
>>>>
>>>> Looking at the code, I don't see where it could get stuck. If for some
>>>> reason there is no such reference (there is based on the ACPI dump) then
>>>> it should not affect the boot. It only matters when power management is
>>>> involved.
>>>
>>> Nothing jumps out to me either.  Maybe this is a situation where Harry can
>>> sprinkle a bunch of printk's all over usb_acpi_add_usb4_devlink() to
>>> shed light on what's going on (assuming the console output is "working"
>>> when this happens).
>>

I sprinkled printks but don't see any on the console.

Harry

>> There are a couple of places there that may cause it to crash, I think.
>> And the __free() magic is something I cannot wrap my head around :(
>>
>> Anyways, Harry can you try the below patch and see if it makes any
>> difference? Also if it does please provide dmesg.
>>
> 
> The patch doesn't seem to make a difference. Same hang on boot.
> 
> Harry
> 
>> diff --git a/drivers/usb/core/usb-acpi.c b/drivers/usb/core/usb-acpi.c
>> index 21585ed89ef8..90360f7ca905 100644
>> --- a/drivers/usb/core/usb-acpi.c
>> +++ b/drivers/usb/core/usb-acpi.c
>> @@ -157,6 +157,7 @@ EXPORT_SYMBOL_GPL(usb_acpi_set_power_state);
>>   */
>>  static int usb_acpi_add_usb4_devlink(struct usb_device *udev)
>>  {
>> +	struct fwnode_handle *nhi_fwnode;
>>  	const struct device_link *link;
>>  	struct usb_port *port_dev;
>>  	struct usb_hub *hub;
>> @@ -165,11 +166,12 @@ static int usb_acpi_add_usb4_devlink(struct usb_device *udev)
>>  		return 0;
>>  
>>  	hub = usb_hub_to_struct_hub(udev->parent);
>> -	port_dev = hub->ports[udev->portnum - 1];
>> +	if (WARN_ON(!hub))
>> +		return 0;
>>  
>> -	struct fwnode_handle *nhi_fwnode __free(fwnode_handle) =
>> -		fwnode_find_reference(dev_fwnode(&port_dev->dev), "usb4-host-interface", 0);
>> +	port_dev = hub->ports[udev->portnum - 1];
>>  
>> +	nhi_fwnode = fwnode_find_reference(dev_fwnode(&port_dev->dev), "usb4-host-interface", 0);
>>  	if (IS_ERR(nhi_fwnode))
>>  		return 0;
>>  
>> @@ -180,12 +182,14 @@ static int usb_acpi_add_usb4_devlink(struct usb_device *udev)
>>  	if (!link) {
>>  		dev_err(&port_dev->dev, "Failed to created device link from %s to %s\n",
>>  			dev_name(&port_dev->child->dev), dev_name(nhi_fwnode->dev));
>> +		fwnode_handle_put(nhi_fwnode);
>>  		return -EINVAL;
>>  	}
>>  
>> -	dev_dbg(&port_dev->dev, "Created device link from %s to %s\n",
>> -		dev_name(&port_dev->child->dev), dev_name(nhi_fwnode->dev));
>> +	dev_info(&port_dev->dev, "Created device link from %s to %s\n",
>> +		 dev_name(&port_dev->child->dev), dev_name(nhi_fwnode->dev));
>>  
>> +	fwnode_handle_put(nhi_fwnode);
>>  	return 0;
>>  }
>>  
>>
> 
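
For readers puzzled by the __free() annotation that the patch above
replaces with explicit fwnode_handle_put() calls: in the kernel it is
built on the compiler's "cleanup" variable attribute (GCC and Clang),
which runs a handler when the annotated variable goes out of scope. Below
is a minimal userspace sketch of that mechanism only. The names
struct node, node_get(), node_put(), auto_put and use_node() are
hypothetical stand-ins for illustration, not kernel APIs, and the sketch
does not attempt to reproduce the hang discussed in this thread.

```c
/* Hypothetical stand-in for a refcounted object; NOT the kernel's fwnode. */
struct node {
	int refcount;
};

static int put_calls;	/* counts releases so the effect is observable */

static struct node *node_get(struct node *n)
{
	n->refcount++;
	return n;
}

static void node_put(struct node *n)
{
	if (!n)
		return;
	put_calls++;
	n->refcount--;
}

/*
 * Cleanup handler. The compiler calls it with a pointer to the annotated
 * variable (hence struct node **) when that variable leaves scope.
 */
static void node_put_cleanup(struct node **np)
{
	node_put(*np);
}

/* Hypothetical shorthand playing the role of the kernel's __free(name). */
#define auto_put __attribute__((cleanup(node_put_cleanup)))

/* Returns 0 on success, -1 on the early exit. Both paths drop the reference. */
static int use_node(struct node *shared, int fail_early)
{
	struct node *ref auto_put = node_get(shared);

	if (fail_early)
		return -1;	/* node_put_cleanup(&ref) runs on this return */

	return 0;		/* ... and on this one */
}
```

Calling use_node() with either value of fail_early leaves the reference
count where it started, because the handler runs on every return path.
The patch above trades that implicit release for one explicit put call
per exit path, which is easier to follow when reading the function but
has to be kept in sync by hand.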




