Re: Recovering from transaction errors [was: Re: [syzbot] INFO: rcu detected stall in tx]

Alan Stern wrote:
> On Thu, May 20, 2021 at 09:23:57PM +0000, Thinh Nguyen wrote:
>> Alan Stern wrote:
>>>> If the cable is unplugged, then we should get a connection change event
>>>> and the driver can handle it properly.
>>>
>>> Yes -- unless the driver is in such a tight retry loop that the rest of 
>>> the system never gets a chance to process the connection change event.  
>>> I've seen bug reports where that happened.
>>
>> I see. I'll keep that in mind, but it sounds like a HW issue? The driver
>> handles retries based on events generated by the HW, and the HW should
>> properly generate a connection event and not get stuck in some loop.
> 
> The hardware _does_ generate disconnect events.  The problem is that the 
> class driver doesn't react properly to transaction errors and thereby 
> prevents the rest of the system from handling the disconnect events.  
> It's a bug in the class driver, not in the hardware.

Ok. Got it.
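
To make sure I have the failure mode right, here is a rough user-space model in C. It is only a sketch, not kernel code: `enum policy`, `count_submissions()` and the iteration budget are made-up names for illustration, and it assumes every transfer fails, as it would after an unplug. It compares a driver that resubmits unconditionally with one that stops after the first error.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical policies for a class driver's completion handler. */
enum policy { RESUBMIT_ALWAYS, GIVE_UP_ON_ERROR };

/*
 * Model assumption: every transfer completes with a transaction error.
 * Returns how many submissions the driver makes before it stops on its
 * own or before the iteration budget runs out.
 */
static unsigned count_submissions(enum policy p, unsigned budget)
{
    unsigned n = 0;
    bool active = true;

    while (active && n < budget) {
        n++;                    /* submit; it completes with an error */
        if (p == GIVE_UP_ON_ERROR)
            active = false;     /* stop communicating with the device */
        /* RESUBMIT_ALWAYS: loop around and submit again immediately */
    }
    return n;
}
```

With RESUBMIT_ALWAYS the model only stops when the budget is exhausted, which stands in for the loop that starves the rest of the system. With GIVE_UP_ON_ERROR it makes a single submission and then goes quiet.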

> 
>>>>> For the case in question (the syzbot bug report that started this 
>>>>> thread), the class driver doesn't try to perform any recovery.  It just 
>>>>> resubmits the URB, getting into a tight retry loop which consumes too 
>>>>> much CPU time.  Simply giving up would be preferable.
>>>>>
>>>>> Alan Stern
>>>>>
>>>>
>>>> I see. By giving up, you mean doing a port reset, right? Otherwise it needs
>>>> some other mechanism to synchronize with the device side.
>>>
>>> No, I mean the driver should just stop communicating with the device.  
>>> That's an appropriate action for lots of drivers.  If the user wants to 
>>> re-synchronize with the device, he can unplug the USB cable and plug it 
>>> back in again.
>>>
>>> Alan Stern
>>>
>>
>> Ok. Would it be more difficult to automate this if it requires user
>> intervention? I assume syzbot doesn't want the user to do that.
> 
> Difficult to automate what, exactly?  Unplugging the USB cable?  How 
> could you possibly automate that?
> 
> At the moment, I think the best approach is Guido's suggestion to reject 
> URBs submitted to endpoints that have gotten a transaction error, until 
> the error status has somehow been cleared.  Is that what you would like 
> to see automated?
> 

First, I should point out that I'm not familiar with syzbot. I was just
thinking that if this issue occurs and the user wants to start a new
test, she shouldn't have to unplug and replug the device; some
application could automatically trigger a new test after a failure.
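
For reference, this is how I picture Guido's suggestion quoted above, as a small user-space model in C. It is a sketch under my own assumptions and not a proposal for the actual core code: `struct ep_model` and the three `ep_*()` helpers are hypothetical names, and returning -EPROTO for a rejected submission is an arbitrary choice for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-endpoint state; not a kernel structure. */
struct ep_model {
    bool error_pending;     /* set once a transaction error is seen */
    unsigned accepted;      /* number of submissions accepted so far */
};

/* Model of a submission: refused while an error is outstanding. */
static int ep_submit(struct ep_model *ep)
{
    if (ep->error_pending)
        return -EPROTO;     /* assumed error code, see note above */
    ep->accepted++;
    return 0;
}

/* Model of a completion that reports a transaction error. */
static void ep_transaction_error(struct ep_model *ep)
{
    ep->error_pending = true;
}

/* Model of whatever clears the error status (left unspecified here). */
static void ep_clear_error(struct ep_model *ep)
{
    ep->error_pending = false;
}
```

In this model a driver that blindly resubmits gets an immediate failure instead of spinning, and whatever ends up calling something like ep_clear_error() would be the point where a new test could be started without touching the cable.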

BR,
Thinh



