Basically the issue comes from hub_port_connect. drivers/usb/core/hub.c hub_port_init returns -71 -EPROTO and jumps to loop https://github.com/torvalds/linux/blob/master/drivers/usb/core/hub.c#L5450 I'd question if usb_ep0_reinit is really required in loop which is running following functions: usb_disable_endpoint(udev, 0 + USB_DIR_IN, true); usb_disable_endpoint(udev, 0 + USB_DIR_OUT, true); usb_enable_endpoint(udev, &udev->ep0, true); this is something only experience over the past decades can tell? usb_enable_endpoint will trigger xhci_endpoint_reset which doesn't do much, but crashes the entire system with the upstream kernel when it triggers xhci_check_bw_table). I removed usb_ep0_reinit here and devices are still workable under various conditions (again I shorted and pulled D+/D- to ground for testing). The NULL PTR check in xhci_check_bw_table would be a second line of defense but as indicated in the first mail it shouldn't even get there. As a second issue I found in usb_reset_and_verify device https://github.com/torvalds/linux/blob/master/drivers/usb/core/hub.c#L6131 ret = hub_port_init(parent_hub, udev, port1, i, &descriptor); if (ret >= 0 || ret == -ENOTCONN || ret == -ENODEV) { break; } hub_port_init can also return -71 / -EPROTO, the cases should be very rare when usb_reset_and_verify_device is triggered and that happens. I'm just waiting for comments now since this is some critical piece of infrastructure code before proceeding with a patch. On Sun, 2024-11-17 at 15:33 +0800, Markus Rechberger wrote: > Hi, > > > the issue was first reported at vdr-portal.de > https://www-vdr--portal-de.translate.goog/forum/index.php?thread/136541-empfehlung-dvb-s2-tuner-oder-satip/&postID=1376196&_x_tr_sl=de&_x_tr_tl=en&_x_tr_hl=de&_x_tr_pto=wapp#post1376196 > > we've got around a highly critical bug in the xhci driver. > > https://sundtek.de/support/uxvd32.txt > > In xhci.c > > The bug is still active in Mainline: > https://github.com/torvalds/linux/blob/master/drivers/usb/host/xhci.c#L2382 > > static int xhci_check_bw_table(struct xhci_hcd *xhci, > struct xhci_virt_device *virt_dev, > int old_active_eps) > > bw_table can end up with a NULL pointer. > > This problem will lead to a complete kernel crash, rendering the > entire > system unusable without any access to the actual linux system. > > How to trigger the problem? > Short D+/D- or pull them to ground on a USB device while connecting > the > device. > > The problem can happen due to following cases: > * a device is getting suddenly disconnected during enumeration > * a faulty cable > * a faulty device > * a malicious device triggers this issue on purpose > * if there are electrical issues during connecting a device. > > A quick hotfix would be to check if bw_table is NULL in > xhci_check_bw_table, however the check should be performed earlier - > in > the area where bw_table is supposed to be assigned. > > Best Regards, > Markus