Re: Highly critical bug in XHCI Controller

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Nov 17, 2024 at 11:47:52PM +0800, Markus Rechberger wrote:
> On Sun, 2024-11-17 at 10:18 -0500, Alan Stern wrote:
> > On Sun, Nov 17, 2024 at 08:44:16PM +0800, Markus Rechberger wrote:
> > > Basically the issue comes from hub_port_connect.
> > > 
> > > drivers/usb/core/hub.c
> > > 
> > > hub_port_init returns -71 -EPROTO and jumps to loop
> > > https://github.com/torvalds/linux/blob/master/drivers/usb/core/hub.c#L5450
> > > 
> > > I'd question if usb_ep0_reinit is really required in loop which is
> > > running following functions:

> This should only go down there in case an error happened earlier no?

Of course, since -EPROTO indicates there was an error.

> In case the hardware signed up correctly it should not even enter that
> code.
> 
> My experience is just - reconnect the device in case an error happened

You can't reconnect some devices; they are permanently attached.  There 
are other reasons why reconnecting might not be practical.

> those
> workarounds did not work properly for the device I deal with (but yes
> that's
> why I'm asking - maybe someone else has different hardware with
> different
> experience).

I doubt we will hear from these people, because they will not realize 
that anything was wrong.

In any case, it is generally recognized that the type of errors leading 
to -EPROTO (bad CRC or no response, for instance) can be temporary or 
intermittent.  Retrying is an appropriate strategy.

> > > As a second issue I found in usb_reset_and_verify device 
> > > https://github.com/torvalds/linux/blob/master/drivers/usb/core/hub.c#L6131
> > > 
> > >         ret = hub_port_init(parent_hub, udev, port1, i,
> > > &descriptor);
> > >         if (ret >= 0 || ret == -ENOTCONN || ret == -ENODEV) {
> > >             break;
> > >         }
> > > 
> > > hub_port_init can also return -71 / -EPROTO, the cases should be
> > > very
> > > rare when usb_reset_and_verify_device is triggered and that
> > > happens.
> > 
> > If that happens, the loop which this code sits inside will simply 
> > perform another iteration.  That's what  it's supposed to do, not an 
> > issue at all.
> > 
> 
> It doesn't cause any issue yes but it's not correct either.

Why not?  What's wrong with it?

> -EPROTO will be returned if I disconnect or short the device here. So
> initially someone
> thought he should check for -ENODEV/-ENOTCONN (which should also
> indicate that the device is gone), so -EPROTO should also be checked in
> that case.
> Otherwise just remove all those error checks.

-EPROTO does not necessarily indicate the device is gone; it indicates 
there was a problem communicating with the device.  The problem might be 
caused by a disconnection, or it might be caused by temporary 
electromagnetic interference (for example), or by something else.

Alan Stern




[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux