Re: Recovering from transaction errors [was: Re: [syzbot] INFO: rcu detected stall in tx]

On Thu, May 20, 2021 at 09:23:57PM +0000, Thinh Nguyen wrote:
> Alan Stern wrote:
> >> If the cable is unplugged, then we should get a connection change event
> >> and the driver can handle it properly.
> > 
> > Yes -- unless the driver is in such a tight retry loop that the rest of 
> > the system never gets a chance to process the connection change event.  
> > I've seen bug reports where that happened.
> 
> I see. I'll keep that in mind, but it sounds like a HW issue? The driver
> handles retries based on events generated by the HW, and the HW should
> properly generate a connection event and not get stuck in some loop.

The hardware _does_ generate disconnect events.  The problem is that the 
class driver doesn't react properly to transaction errors and thereby 
prevents the rest of the system from handling the disconnect events.  
It's a bug in the class driver, not in the hardware.
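
As a rough illustration of what reacting properly could look like, here is 
a user-space simulation, not kernel code.  Every name in it (sim_urb, 
sim_complete, sim_run) is made up for the example, and using -EPROTO to 
stand for a transaction error is an assumption.  The only point is that 
the completion handler stops resubmitting once it sees an error, so 
nothing spins.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a URB; this is not the kernel's struct urb. */
struct sim_urb {
	int status;	/* 0 on success, negative errno on failure */
	int resubmits;	/* number of times the handler resubmitted */
	bool stopped;	/* set once the driver has given up */
};

/* Hypothetical completion handler: resubmit only after a success. */
static void sim_complete(struct sim_urb *urb)
{
	if (urb->status != 0) {
		/* Transaction error: stop talking to the device. */
		urb->stopped = true;
		return;
	}
	urb->resubmits++;	/* stands in for resubmitting the URB */
}

/*
 * Simulate a device whose transfers fail from completion number
 * fail_at onward.  Returns how many times the URB was resubmitted.
 */
static int sim_run(int fail_at, int max_completions)
{
	struct sim_urb urb = { 0 };
	int i;

	for (i = 0; i < max_completions && !urb.stopped; i++) {
		urb.status = (i >= fail_at) ? -EPROTO : 0;
		sim_complete(&urb);
	}

	/* The handler never resubmits after the first failure. */
	assert(urb.resubmits <= fail_at);
	return urb.resubmits;
}
```

With a failure at the fourth completion, sim_run(3, 1000) stops after 
three resubmissions instead of running through all 1000 iterations.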

> >>> For the case in question (the syzbot bug report that started this 
> >>> thread), the class driver doesn't try to perform any recovery.  It just 
> >>> resubmits the URB, getting into a tight retry loop which consumes too 
> >>> much CPU time.  Simply giving up would be preferable.
> >>>
> >>> Alan Stern
> >>>
> >>
> >> I see. By giving up, you mean doing a port reset, right? Otherwise it
> >> needs some other mechanism to synchronize with the device side.
> > 
> > No, I mean the driver should just stop communicating with the device.  
> > That's an appropriate action for lots of drivers.  If the user wants to 
> > re-synchronize with the device, he can unplug the USB cable and plug it 
> > back in again.
> > 
> > Alan Stern
> > 
> 
> Ok. Would it be more difficult to automate this if it requires user
> intervention? I assume syzbot doesn't want the user to do that.

Difficult to automate what, exactly?  Unplugging the USB cable?  How 
could you possibly automate that?

At the moment, I think the best approach is Guido's suggestion to reject 
URBs submitted to endpoints that have gotten a transaction error, until 
the error status has somehow been cleared.  Is that what you would like 
to see automated?
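
To make the idea concrete, here is a minimal user-space sketch, again not 
kernel code.  The structure and function names (sim_endpoint, sim_submit, 
and so on) are hypothetical, the -EPIPE return value is an assumption, and 
how the error status would actually get cleared is left open, so 
sim_clear_error() is only a placeholder for whatever mechanism ends up 
doing that.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-endpoint state; not a real kernel structure. */
struct sim_endpoint {
	bool error_latched;	/* set when a transaction error occurs */
	int accepted;		/* submissions that were queued */
};

/* Hypothetical submit path: refuse new URBs while the error is latched. */
static int sim_submit(struct sim_endpoint *ep)
{
	if (ep->error_latched)
		return -EPIPE;	/* assumed error code for the rejection */
	ep->accepted++;
	return 0;
}

/* Record a transaction error on the endpoint. */
static void sim_transaction_error(struct sim_endpoint *ep)
{
	ep->error_latched = true;
}

/* Placeholder for whatever ends up clearing the error status. */
static void sim_clear_error(struct sim_endpoint *ep)
{
	ep->error_latched = false;
}

/* Walk through accept, reject, clear, accept; returns the accepted count. */
static int sim_latch_demo(void)
{
	struct sim_endpoint ep = { 0 };
	int rc;

	rc = sim_submit(&ep);		/* healthy endpoint: accepted */
	assert(rc == 0);

	sim_transaction_error(&ep);
	rc = sim_submit(&ep);		/* latched: rejected immediately */
	assert(rc == -EPIPE);

	sim_clear_error(&ep);
	rc = sim_submit(&ep);		/* cleared: accepted again */
	assert(rc == 0);

	return ep.accepted;
}
```

A driver that blindly resubmits would then get an immediate failure from 
the submit call rather than another completion to retry on, which is what 
breaks the loop.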

Alan Stern


