On 14.8.2024 16.28, Mathias Nyman wrote:
On 13.8.2024 14.49, Mathias Nyman wrote:
On 11.8.2024 1.11, Karel Balej wrote:
Hello,
my machine crashed twice in the past week. The second time I was
able to recover the log output (including the stack trace run through
scripts/decode_stacktrace.sh), which seems to suggest a bug in the xHCI
driver:
You have an unlucky setup here.
This could only happen when a full speed device fails enumeration while connected to a
Pantherpoint xHC.
Only Pantherpoint xHC (PCI_ID 0x1e31) does bandwidth calculation in software and
calls xhci_reserve_bandwidth(). In this case we unintentionally end up calling it
after a failed address device attempt when usb core re-inits endpoint 0 before retry.
At this point the xhci side of the device isn't properly allocated or set up so
we hit a NULL pointer dereference.
I'll look into it more.
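To make the failure mode described above concrete, here is a minimal userspace sketch. It is not the snippet proposed in this thread and not real xhci_hcd code: every model_* name, the struct layouts, and the choice of -ENODEV are assumptions made purely for illustration. It only shows the general pattern of a software bandwidth path that must not run when the per-device state was never set up.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real xhci_hcd structures. */
struct model_bw_table { unsigned int bw_used; };

struct model_virt_dev {
	struct model_bw_table *bw_table; /* NULL until the device is fully set up */
};

#define MODEL_QUIRK_SW_BW_CHECKING (1u << 0)
#define MODEL_MAX_SLOTS 4

struct model_hcd {
	unsigned int quirks;
	struct model_virt_dev *devs[MODEL_MAX_SLOTS];
};

/* Software bandwidth accounting; requires a fully set up device. */
static int model_reserve_bandwidth(struct model_virt_dev *virt_dev, unsigned int bw)
{
	assert(virt_dev && virt_dev->bw_table);
	virt_dev->bw_table->bw_used += bw;
	return 0;
}

/* Models the endpoint 0 re-init before a retry, with a guard in front. */
static int model_reset_ep0(struct model_hcd *hcd, int slot_id, unsigned int bw)
{
	struct model_virt_dev *virt_dev;

	if (slot_id < 0 || slot_id >= MODEL_MAX_SLOTS)
		return -EINVAL;

	/* Controllers without the quirk do the accounting in hardware. */
	if (!(hcd->quirks & MODEL_QUIRK_SW_BW_CHECKING))
		return 0;

	/* After a failed address device attempt the state may be missing. */
	virt_dev = hcd->devs[slot_id];
	if (!virt_dev || !virt_dev->bw_table)
		return -ENODEV;

	return model_reserve_bandwidth(virt_dev, bw);
}

/* Failed enumeration: no device state exists for the slot. */
static int model_demo_failed_enumeration(void)
{
	struct model_hcd hcd = { .quirks = MODEL_QUIRK_SW_BW_CHECKING };

	return model_reset_ep0(&hcd, 1, 10);
}

/* Normal case: the device is set up, so bandwidth is accounted. */
static unsigned int model_demo_working_device(void)
{
	struct model_bw_table table = { 0 };
	struct model_virt_dev dev = { .bw_table = &table };
	struct model_hcd hcd = { .quirks = MODEL_QUIRK_SW_BW_CHECKING };

	hcd.devs[1] = &dev;
	if (model_reset_ep0(&hcd, 1, 10) != 0)
		return 0;
	return table.bw_used;
}
```

In this model the failed-enumeration case returns an error instead of dereferencing a NULL pointer, while the working device still gets its bandwidth accounted. Whether the real driver should guard, skip, or reorder the call is a separate question that this sketch does not answer.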
The following code should resolve this issue. Any chance you could try it out?
I was able to trigger this myself by forcing XHCI_SW_BW_CHECKING and faking a failure of
the address device command:
[ 270.538134] usb 3-6: new full-speed USB device number 3 using xhci_hcd
[ 270.670313] xhci_hcd 0000:00:14.0: Faking a Device not respoinding to setup address
[ 270.886142] usb 3-6: device not accepting address 3, error -71
[ 270.892091] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 270.899034] #PF: supervisor read access in kernel mode
[ 270.904150] #PF: error_code(0x0000) - not-present page
[ 270.909267] PGD 0 P4D 0
[ 270.911799] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 270.916660] CPU: 3 UID: 0 PID: 301 Comm: kworker/3:2 Tainted: G W 6.11.0-rc1+ #4291
[ 270.925651] Tainted: [W]=WARN
[ 270.928615] Workqueue: usb_hub_wq hub_event
[ 270.932787] RIP: 0010:xhci_reserve_bandwidth+0x243/0x6d0 [xhci_hcd]
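As a side note on reading the oops above: a fault address of 0000000000000008 is consistent with reading a structure member located 8 bytes from a NULL base pointer. The sketch below illustrates only that arithmetic. The struct layout and the model_* names are hypothetical and do not correspond to any real xhci_hcd structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, not a real xhci_hcd structure. */
struct model_dev {
	uint64_t first;  /* offset 0 */
	uint64_t second; /* offset 8 */
};

static_assert(offsetof(struct model_dev, second) == 8,
	      "second is expected 8 bytes into the struct");

/*
 * Address the CPU would access for base->second, computed from the
 * base address without dereferencing anything.
 */
static uintptr_t model_fault_address(uintptr_t base)
{
	return base + offsetof(struct model_dev, second);
}
```

With a NULL base, model_fault_address(0) evaluates to 8, which matches the shape of the address reported in the log. Which member the driver actually touched can only be determined from the decoded stack trace.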
The code snippet I suggested did fix the NULL pointer dereference.
I'll turn it into a proper patch.
Thanks
Mathias