Re: [REGRESSION] usb: acpi: add device link between tunneled USB3 device and USB4 Host Interface

On 2024-10-10 08:01, Mathias Nyman wrote:
> On 10.10.2024 5.23, Mario Limonciello wrote:
>> On 10/9/2024 16:52, Mathias Nyman wrote:
>>> On 3.10.2024 16.47, Mika Westerberg wrote:
>>>> On Thu, Oct 03, 2024 at 08:42:21AM -0500, Mario Limonciello wrote:
>>>>> On 10/3/2024 08:27, Mika Westerberg wrote:
>>>>>> On Thu, Oct 03, 2024 at 08:10:11AM -0500, Mario Limonciello wrote:
>>>>>>> On 10/3/2024 00:47, Mika Westerberg wrote:
>>>>>>>> Hi Harry,
>>>>>>>>
>>>>>>>> On Wed, Oct 02, 2024 at 01:42:29PM -0400, Harry Wentland wrote:
>>>>>>>>> I was checking out the 6.12 rc1 (through drm-next) kernel and found
>>>>>>>>> my system hung at boot. No meaningful message showed on the kernel
>>>>>>>>> boot screen.
>>>>>>>>>
>>>>>>>>> A bisect revealed the culprit to be
>>>>>>>>>
>>>>>>>>> commit f1bfb4a6fed64de1771b43a76631942279851744 (HEAD)
>>>>>>>>> Author: Mathias Nyman <mathias.nyman@xxxxxxxxxxxxxxx>
>>>>>>>>> Date:   Fri Aug 30 18:26:29 2024 +0300
>>>>>>>>>
>>>>>>>>>        usb: acpi: add device link between tunneled USB3 device and USB4 Host Interface
>>>>>>>>>
>>>>>>>>> A revert of this single patch "fixes" the issue and I can boot again.
>>>>>>>>> The system in question is a Thinkpad T14 with a Ryzen 7 PRO 6850U CPU.
>>>>>>>>> It's running Arch Linux but I doubt that's of consequence.
>>>>>>>>>
>>>>>>>>> lspci output:
>>>>>>>>>        https://gist.github.com/hwentland/59aef63d9b742b7b64d2604aae9792e0
>>>>>>>>> acpidump:
>>>>>>>>>        https://gist.github.com/hwentland/4824afc8d712c3d600be5c291f7f1089
>>>>>>>>>
>>>>>>>>> Mario suggested I try modprobe.blacklist=xhci-hcd but that did nothing.
>>>>>>>>> Another suggestion to do usbcore.nousb lets me boot to the desktop
>>>>>>>>> on a kernel with the faulty patch, without USB functionality, obviously.
>>>>>>>>>
>>>>>>>>> I'd be happy to try any patches, provide more data, or run experiments.
>>>>>>>>
>>>>>>>> Do you boot with any device connected?
>>>>>>>>
>>>>>>>> Second thing that I noticed, though I'm not familiar with AMD hardware,
>>>>>>>> but from your lspci dump, I do not see the PCIe ports that are being
>>>>>>>> used to tunnel PCIe. Does this system have PCIe tunneling disabled
>>>>>>>> somehow?
>>>>>>>
>>>>>>> On some OEM systems it's possible to lock down from BIOS to turn off PCIe
>>>>>>> tunneling, and I agree that looks like the most common cause.
>>>>>>>
>>>>>>> This is what you would see on a system that has tunnels (I checked on my
>>>>>>> side w/ Z series laptop w/ Rembrandt and a dock connected):
>>>>>>>
>>>>>>>              +-03.0
>>>>>>>              +-03.1-[03-32]--
>>>>>>>              +-04.0
>>>>>>>              +-04.1-[33-62]----00.0-[34-62]--+-02.0-[35]----00.0
>>>>>>>              |                               \-04.0-[36-62]--
>>>>>>>
>>>>>>> 00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family
>>>>>>> 17h-19h PCIe Dummy Host Bridge [1022:14b7] (rev 01)
>>>>>>> 00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 19h
>>>>>>> USB4/Thunderbolt PCIe tunnel [1022:14cd]
>>>>>>> 00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family
>>>>>>> 17h-19h PCIe Dummy Host Bridge [1022:14b7] (rev 01)
>>>>>>> 00:04.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 19h
>>>>>>> USB4/Thunderbolt PCIe tunnel [1022:14cd]
>>>>>>
>>>>>> Okay this is more like what I expected, although probably not the
>>>>>> reason here.
>>>>>>
>>>>>> Are you able to replicate the issue if you disable PCIe tunneling from
>>>>>> the BIOS on your reference system? (Probably not but just in case).
>>>>>
>>>>> I checked on the Lenovo Z13 laptop I have and turned off "USB port" in BIOS
>>>>> setup and this caused the endpoints 3.1 and 4.1 I listed above to disappear
>>>>> but the system still boots up just fine for me on 6.12-rc1.
>>>>
>>>> Okay thanks for checking!
>>>>
>>>>>>>> You don't see anything on the console? It's all blank or it just hangs
>>>>>>>> after some messages?
>>>>>>>
>>>>>>> I guess it is getting stuck on fwnode_find_reference() because it never
>>>>>>> finds the given node?
>>>>>>
>>>>>> Looking at the code, I don't see where it could get stuck. If for some
>>>>>> reason there is no such reference (there is based on the ACPI dump) then
>>>>>> it should not affect the boot. It only matters when power management is
>>>>>> involved.
>>>>>
>>>>> Nothing jumps out to me either.  Maybe this is a situation where Harry can
>>>>> sprinkle a bunch of printk's all over usb_acpi_add_usb4_devlink() to
>>>>> shed light on what's going on (assuming the console output is "working" when
>>>>> this happened).
>>>>
>>>> There are a couple of places there that may cause it to crash, I think.
>>>
>>> It's possible we end up trying to create a device link during usb3 device
>>> "consumer" enumeration before the "supplier" NHI device is properly bound to a driver.
>>>
>>> This is something driver-api/device_link.rst states can cause issues.
>>>
>>> This could happen if xhci isn't capable of detecting tunneled devices,
>>> but the ACPI tables contain all the info needed to assume the device might be
>>> tunneled, i.e. udev->tunnel_mode == USB_LINK_UNKNOWN.
>>>
>>> Harry, could you test if the code below helps?
>>>
>>> diff --git a/drivers/usb/core/usb-acpi.c b/drivers/usb/core/usb-acpi.c
>>> index 21585ed89ef8..94c335a7b933 100644
>>> --- a/drivers/usb/core/usb-acpi.c
>>> +++ b/drivers/usb/core/usb-acpi.c
>>> @@ -173,6 +173,13 @@ static int usb_acpi_add_usb4_devlink(struct usb_device *udev)
>>>          if (IS_ERR(nhi_fwnode))
>>>                  return 0;
>>>
>>> +       if (!nhi_fwnode->dev || !device_is_bound(nhi_fwnode->dev)) {
>>> +               dev_info(&port_dev->dev, "%s not tunneled as it probed before USB4 Host Interface\n",
>>
>> I'm aware this message is mostly there to prove whether this is the actual issue, but if this patch indeed helps Harry's problem and you keep a message in what goes upstream, I don't think the wording is accurate for all cases.
>>
>> If you have a Pre-OS CM, it might build tunnels, and those could stay active until the USB4 CM loads and resets them (the default behavior).
>>
>> So I think a more accurate message would just be "%s probed before USB4 host interface".
> 
> Makes sense, I'll tune the message in the final patch if this works
> 

Apologies for the late response. I was traveling last week.

This patch does the trick, i.e., no more hangs on boot when
connected to the Lenovo USB dock.
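
For anyone following along, the guard being tested can be modelled in
user space roughly as below. This is only a sketch under assumptions:
struct model_device, struct model_fwnode, enum link_result and
model_add_usb4_devlink() are hypothetical stand-ins invented for
illustration, not kernel API, and the real patch is the truncated diff
quoted above. The sketch only shows the ordering rule, i.e. skip the
link when the supplier (NHI) has no device yet or is not bound to a
driver.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device; NOT the kernel's struct device. */
struct model_device {
	const char *name;
	bool bound;	/* true once a driver has bound to the device */
};

/* Hypothetical stand-in for a firmware node that may point at a device. */
struct model_fwnode {
	struct model_device *dev;	/* NULL if no device exists yet */
};

/* Possible outcomes of the modelled link attempt (names are assumptions). */
enum link_result {
	LINK_CREATED,
	LINK_SKIPPED_NO_SUPPLIER,
	LINK_SKIPPED_UNBOUND,
};

/*
 * Model of the check: only "create" the consumer->supplier link when the
 * supplier device exists and already has a driver bound to it.
 */
static enum link_result model_add_usb4_devlink(const struct model_fwnode *nhi)
{
	if (nhi == NULL || nhi->dev == NULL)
		return LINK_SKIPPED_NO_SUPPLIER;

	if (!nhi->dev->bound)
		return LINK_SKIPPED_UNBOUND;

	return LINK_CREATED;
}
```

With an unbound supplier the model returns LINK_SKIPPED_UNBOUND, which
corresponds to the case where the USB3 device is enumerated before the
USB4 Host Interface has probed.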

Harry


> Thanks
> Mathias
> 




