Re: Highly critical bug in XHCI Controller

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, 2024-11-17 at 10:18 -0500, Alan Stern wrote:
> On Sun, Nov 17, 2024 at 08:44:16PM +0800, Markus Rechberger wrote:
> > Basically the issue comes from hub_port_connect.
> > 
> > drivers/usb/core/hub.c
> > 
> > hub_port_init returns -71 -EPROTO and jumps to loop
> > https://github.com/torvalds/linux/blob/master/drivers/usb/core/hub.c#L5450
> > 
> > I'd question if usb_ep0_reinit is really required in loop which is
> > running following functions:
> 
> You mean that usb_ep0_reinit() runs the following, not that the loop 
> does.
> 
> >     usb_disable_endpoint(udev, 0 + USB_DIR_IN, true);
> >     usb_disable_endpoint(udev, 0 + USB_DIR_OUT, true);
> >     usb_enable_endpoint(udev, &udev->ep0, true);
> > 
> > this is something only experience over the past decades can tell?
> 
> It _is_ necessary, because the maxpacket size of ep0 may change from
> one loop iteration to the next.  Therefore the endpoint must be
> disabled 
> and re-enabled each time the loop repeats.
> 
> [Now that I go back through the git log, it appears the only reason
> for 
> exporting usb_ep0_reinit was so that the WUSB driver could call it --
> see commit fc721f5194dc ("wusb: make ep0_reinit available for
> modules").  
> Since the kernel doesn't support WUSB any more, we should be able to 
> stop exporting that function.]

This should only go down there in case an error happened earlier no?
In case the hardware signed up correctly it should not even enter that
code.

My experience is just - reconnect the device in case an error happened
those
workarounds did not work properly for the device I deal with (but yes
that's
why I'm asking - maybe someone else has different hardware with
different
experience).

> 
> > usb_enable_endpoint will trigger xhci_endpoint_reset which doesn't
> > do
> > much, but crashes the entire system with the upstream kernel when
> > it
> > triggers xhci_check_bw_table).
> > 
> > I removed usb_ep0_reinit here and devices are still workable under
> > various conditions (again I shorted and pulled D+/D- to ground for
> > testing).
> > The NULL PTR check in xhci_check_bw_table would be a second line of
> > defense but as indicated in the first mail it shouldn't even get
> > there.
> > 
> > 
> > 
> > As a second issue I found in usb_reset_and_verify device 
> > https://github.com/torvalds/linux/blob/master/drivers/usb/core/hub.c#L6131
> > 
> >         ret = hub_port_init(parent_hub, udev, port1, i,
> > &descriptor);
> >         if (ret >= 0 || ret == -ENOTCONN || ret == -ENODEV) {
> >             break;
> >         }
> > 
> > hub_port_init can also return -71 / -EPROTO, the cases should be
> > very
> > rare when usb_reset_and_verify_device is triggered and that
> > happens.
> 
> If that happens, the loop which this code sits inside will simply 
> perform another iteration.  That's what  it's supposed to do, not an 
> issue at all.
> 

It doesn't cause any issue yes but it's not correct either.
-EPROTO will be returned if I disconnect or short the device here. So
initially someone
thought he should check for -ENODEV/-ENOTCONN (which should also
indicate that the device is gone), so -EPROTO should also be checked in
that case.
Otherwise just remove all those error checks.

> Alan Stern
> 






[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux