On 13.8.2024 14.49, Mathias Nyman wrote:
On 11.8.2024 1.11, Karel Balej wrote:
Hello,
my machine crashed twice in the past week. The second time I was able to
recover the log output (including the stack trace run through
scripts/decode_stacktrace.sh), which seems to suggest a bug in the xHCI
driver:
[44193.556677] usb 2-1-port5: disabled by hub (EMI?), re-enabling...
[44193.556692] usb 2-1.5: USB disconnect, device number 6
[44193.558532] cdc_ncm 2-1.5:1.0 enp0s29u1u5: unregister 'cdc_ncm' usb-0000:00:1d.0-1.5, CDC NCM (NO ZLP)
[44193.739545] usb 2-1.5: new high-speed USB device number 7 using ehci-pci
[44193.819628] usb 2-1.5: New USB device found, idVendor=18d1, idProduct=d001, bcdDevice= 6.10
[44193.819637] usb 2-1.5: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[44193.819641] usb 2-1.5: Product: Samsung Galaxy Core Prime VE LTE
[44193.819644] usb 2-1.5: Manufacturer: Samsung
[44193.819646] usb 2-1.5: SerialNumber: postmarketOS
[44193.842472] cdc_ncm 2-1.5:1.0: MAC-Address: [...]
[44193.842770] cdc_ncm 2-1.5:1.0 usb0: register 'cdc_ncm' at usb-0000:00:1d.0-1.5, CDC NCM (NO ZLP), [...]
[44193.845829] cdc_ncm 2-1.5:1.0 enp0s29u1u5: renamed from usb0
[46253.017991] perf: interrupt took too long (2506 > 2500), lowering kernel.perf_event_max_sample_rate to 79000
[46709.344533] usb 3-1: new full-speed USB device number 3 using xhci_hcd
[46709.458560] usb 3-1: device descriptor read/64, error -71
[46709.679562] usb 3-1: device descriptor read/64, error -71
[46709.895544] usb 3-1: new full-speed USB device number 4 using xhci_hcd
[46710.009563] usb 3-1: device descriptor read/64, error -71
[46710.231579] usb 3-1: device descriptor read/64, error -71
[46710.333629] usb usb3-port1: attempt power cycle
[46710.713538] usb 3-1: new full-speed USB device number 5 using xhci_hcd
[46710.713699] usb 3-1: Device not responding to setup address.
[46710.917684] usb 3-1: Device not responding to setup address.
[46711.125536] usb 3-1: device not accepting address 5, error -71
[46711.125594] BUG: kernel NULL pointer dereference, address: 0000000000000008
[46711.125600] #PF: supervisor read access in kernel mode
[46711.125603] #PF: error_code(0x0000) - not-present page
[46711.125606] PGD 0 P4D 0
[46711.125610] Oops: Oops: 0000 [#1] PREEMPT SMP PTI
[46711.125615] CPU: 1 PID: 25760 Comm: kworker/1:2 Not tainted 6.10.3_2 #1
[46711.125620] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./Z77-D3H, BIOS F18 08/21/2012
[46711.125623] Workqueue: usb_hub_wq hub_event [usbcore]
[46711.125668] RIP: 0010:xhci_reserve_bandwidth (drivers/usb/host/xhci.c:2392 drivers/usb/host/xhci.c:2758) xhci_hcd
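A note on reading the fault address in the oops: 0000000000000008 is what one would typically see when code reads a member located 8 bytes into a structure through a NULL pointer. The sketch below only illustrates that arithmetic. struct model_obj and model_fault_address() are hypothetical names, and nothing here claims which driver structure or member was actually involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure; it does not correspond to any real xhci type. */
struct model_obj {
	uint64_t first;   /* offset 0 */
	uint64_t second;  /* offset 8 */
};

/* Two 8-byte members: the second one always sits at offset 8. */
static_assert(offsetof(struct model_obj, second) == 8,
	      "second member expected at offset 8");

/*
 * Address that a read of obj->second would touch for an object located
 * at 'base'. With base == 0 (a NULL pointer) this is just the offset.
 */
uintptr_t model_fault_address(uintptr_t base)
{
	return base + offsetof(struct model_obj, second);
}
```

Calling model_fault_address(0) gives 0x8, matching the shape of the address reported above.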
Thanks for the report.
You have an unlucky setup here.
This can only happen when a full-speed device fails enumeration while connected to a
Pantherpoint xHC.
Only the Pantherpoint xHC (PCI_ID 0x1e31) does bandwidth calculation in software and
calls xhci_reserve_bandwidth(). In this case we unintentionally end up calling it
after a failed address device attempt, when usb core re-inits endpoint 0 before retrying.
At this point the xhci side of the device isn't properly allocated or set up, so
we hit a NULL pointer dereference.
I'll look into it more.
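The failure mode described above can be modelled in user space roughly as follows. This is a simplified sketch, not driver code: the model_* types and functions and the bw_state member are hypothetical stand-ins, and the only idea taken from the thread is that the software bandwidth check must not run for a device whose xhci-side state is not yet set up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for driver state; not the real xhci structures. */
struct model_bw_state {
	int used;
};

struct model_virt_dev {
	struct model_bw_state *bw_state; /* NULL until the device is fully set up */
};

/* Stand-in for the software bandwidth check: dereferences bw_state. */
static int model_reserve_bandwidth(struct model_virt_dev *vdev)
{
	return vdev->bw_state->used > 100 ? -1 : 0;
}

/*
 * Stand-in for the configure path. 'sw_bw_checking' models the
 * software-bandwidth quirk; 'ctx_change' models a call that only
 * re-evaluates an existing context, such as the endpoint 0 re-init
 * done before a retry. In that case the bandwidth check is skipped.
 */
int model_configure_endpoint(struct model_virt_dev *vdev,
			     bool sw_bw_checking, bool ctx_change)
{
	if (sw_bw_checking && !ctx_change && model_reserve_bandwidth(vdev))
		return -1;
	return 0;
}

/* Small self-check: a half-initialised device must not reach the check. */
void model_demo(void)
{
	struct model_virt_dev half_init = { .bw_state = NULL };

	assert(model_configure_endpoint(&half_init, true, true) == 0);
}
```

Without the ctx_change test, the same call on half_init would dereference a NULL pointer inside model_reserve_bandwidth().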
The following patch should resolve this issue. Any chance you could try it out?
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 9a8627e42898..a69245074395 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -2837,7 +2837,7 @@ static int xhci_configure_endpoint(struct xhci_hcd *xhci,
 				xhci->num_active_eps);
 		return -ENOMEM;
 	}
-	if ((xhci->quirks & XHCI_SW_BW_CHECKING) &&
+	if ((xhci->quirks & XHCI_SW_BW_CHECKING) && !ctx_change &&
 	    xhci_reserve_bandwidth(xhci, virt_dev, command->in_ctx)) {
 		if ((xhci->quirks & XHCI_EP_LIMIT_QUIRK))
 			xhci_free_host_resources(xhci, ctrl_ctx);
@@ -4200,8 +4200,10 @@ static int xhci_setup_device(struct usb_hcd *hcd, struct usb_device *udev,
 		mutex_unlock(&xhci->mutex);
 		ret = xhci_disable_slot(xhci, udev->slot_id);
 		xhci_free_virt_device(xhci, udev->slot_id);
-		if (!ret)
-			xhci_alloc_dev(hcd, udev);
+		if (!ret) {
+			if (xhci_alloc_dev(hcd, udev) == 1)
+				xhci_setup_addressable_virt_dev(xhci, udev);
+		}
 		kfree(command->completion);
 		kfree(command);
 		return -EPROTO;
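The intent of the second hunk, as far as it can be read from the diff, is that a freshly re-allocated device is also brought into an addressable state before the retry. A rough user-space model of that error path is sketched below. All model_* names are hypothetical, and the "returns 1 on success" convention is taken only from the == 1 test in the diff.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a device slot; not a real xhci structure. */
struct model_slot {
	bool allocated;   /* slot and virt device exist */
	bool addressable; /* contexts prepared for an address attempt */
};

/* Models re-allocation; returns 1 on success, 0 when no slot is available. */
static int model_alloc_dev(struct model_slot *slot, bool slots_available)
{
	if (!slots_available)
		return 0;
	slot->allocated = true;
	slot->addressable = false;
	return 1;
}

/* Models preparing the new device for the next address attempt. */
static void model_setup_addressable(struct model_slot *slot)
{
	slot->addressable = true;
}

/*
 * Models the failed-address error path: the old slot is torn down and,
 * if that succeeded (disable_ret == 0), a new one is allocated and set
 * up. The caller always sees -EPROTO and is expected to retry.
 */
int model_handle_address_failure(struct model_slot *slot, int disable_ret,
				 bool slots_available)
{
	slot->allocated = false;
	slot->addressable = false;
	if (!disable_ret) {
		if (model_alloc_dev(slot, slots_available) == 1)
			model_setup_addressable(slot);
	}
	return -EPROTO;
}

/* Small self-check of the success case. */
void model_slot_demo(void)
{
	struct model_slot slot = { .allocated = true, .addressable = true };

	assert(model_handle_address_failure(&slot, 0, true) == -EPROTO);
	assert(slot.allocated && slot.addressable);
}
```

In this model, the pre-patch behaviour corresponds to leaving out the model_setup_addressable() call, which leaves the slot allocated but not addressable when the retry arrives.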
Thanks
Mathias