Re: Userspace enumeration hang while btusb tries to load firmware of removed device

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Nov 04, 2021 at 10:34:22AM +0100, Benjamin Berg wrote:
> Hi Marcel and Alan,
> 
> On Wed, 2021-11-03 at 20:31 +0100, Marcel Holtmann wrote:
> > > I'm not familiar with the btusb driver, so someone on the 
> > > linux-bluetooth mailing list would have a better idea about this. 
> > > However, it does look as though btusb keeps the device locked during the 
> > > entire 10-second period while it tries to send over the firmware, and it 
> > > doesn't abort the procedure when it starts getting disconnection errors 
> > > but instead persists until a timeout expires.  Keeping the device locked 
> > > would certainly block lsusb.
> > > 
> > > In general, locking the device during a firmware upload seems like
> > > the right thing to do -- you don't want extraneous transfers from
> > > other processes messing up the firmware!  So overall, it appears that
> > > the whole problem would be solved if the firmware transfer were
> > > aborted as soon as the -ENODEV errors start appearing.
> > 
> > the problem seems to be that we hitting HCI command timeout. So the
> > firmware download is done via HCI commands. These commands are send
> > to the transport driver btusb.c via hdev->send (as btusb_send_frame).
> > This triggers the usb_submit_urb or queues them via data->deferred
> > anchor. All this reports back the error properly except that nobody
> > does anything with it.
> > 
> > See hci_send_frame() last portion:
> > 
> >         err = hdev->send(hdev, skb);                                             
> >         if (err < 0) {                                                           
> >                 bt_dev_err(hdev, "sending frame failed (%d)", err);              
> >                 kfree_skb(skb);                                                  
> >         }
> > 
> > And that is it. We are not checking for ENODEV or any error here.
> > That means the failure of the HCI command gets only caught via the
> > HCI command timeout. I don’t know how to do this yet, but you would
> > have to look there to fail HCI command right away instead of waiting
> > for the timeout.
> 
> Hmm, true, I don't see a "sending frame failed" error message during
> the firmware download though.

It is in the log you posted:

[Mi Nov 3 11:55:23 2021] Bluetooth: hci0: Failed to send firmware data (-110)
[Mi Nov 3 11:55:23 2021] Bluetooth: hci0: sending frame failed (-19)

But this occurred after the timeout, so maybe you had in mind something 
occurring earlier.

>  You are right that this codepath is
> loosing the error, but this does not seem to be the scenario we are
> running into while loading the firmware. This error only happens later
> on from the btintel_reset_to_bootloader function.
> 
> What seems to happen in the posted log is that the URB is initially
> submitted just fine and the transfer errors out afterwards.
> Unfortunately, the btusb_tx_complete is only used for statistics
> (stat.err_tx is increased) and has no further error handling that could
> abort the firmware upload.

While detecting the errors during URB completion would be nice, it isn't 
necessary.  Things would work just as well if the disconnect error were 
detected during submission of the following URB.

Alan Stern



[Index of Archives]     [Bluez Devel]     [Linux Wireless Networking]     [Linux Wireless Personal Area Networking]     [Linux ATH6KL]     [Linux USB Devel]     [Linux Media Drivers]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Big List of Linux Books]

  Powered by Linux