Re: Userspace enumeration hang while btusb tries to load firmware of removed device

Hi Alan,

>> a user is seeing a hang in fprintd while enumerating devices which
>> appears to be caused by an interaction of:
>> 
>> * system is resuming from S3
>> * btusb starts loading firmware
>> * bluetooth device disappears (probably thinkpad_acpi rfkill)
>> * libusb enumerates USB devices (fprintd in this case)
>> 
>> When this happens, the firmware load fails after a timeout of 10s. It
>> appears that if userspace queries information about the root hub in
>> question during this time, it will hang until the btusb firmware load
>> has timed out.
>> 
>> Attaching the full kernel log, below an excerpt, you can see:
>> * At :12 device removal: "usb 5-4: USB disconnect, device number 33"
>> * libusb enumeration retrieves information about the usb5 root hub,
>>   and blocks on this
>> * At :14 there is a tx timeout on hci0
>> * At :23 the firmware load finally fails
>> * Then usb_disable_device happens
>> * libusb/fprintd gets the usb5 HUB information and continues its
>>   enumeration
>> 
>> As I see it, there may be two issues:
>> 1. userspace should not block due to the firmware load hanging
>> 2. btusb should give up more quickly when the device disappears
>> 
>> Does anyone have a good idea about the possible cause or how we can fix
>> the problem?
>> 
>> Downstream issue: https://bugzilla.redhat.com/show_bug.cgi?id=2019857
> 
> I'm not familiar with the btusb driver, so someone on the 
> linux-bluetooth mailing list would have a better idea about this. 
> However, it does look as though btusb keeps the device locked during the 
> entire 10-second period while it tries to send over the firmware, and it 
> doesn't abort the procedure when it starts getting disconnection errors 
> but instead persists until a timeout expires.  Keeping the device locked 
> would certainly block lsusb.
> 
> In general, locking the device during a firmware upload seems like
> the right thing to do -- you don't want extraneous transfers from
> other processes messing up the firmware!  So overall, it appears that
> the whole problem would be solved if the firmware transfer were
> aborted as soon as the -ENODEV errors start appearing.

The problem seems to be that we are hitting the HCI command timeout. The firmware download is done via HCI commands. These commands are sent to the transport driver btusb.c via hdev->send (as btusb_send_frame). This either calls usb_submit_urb or queues the URBs via the data->deferred anchor. All of this reports the error back properly, except that nobody does anything with it.

See the last portion of hci_send_frame():

        err = hdev->send(hdev, skb);
        if (err < 0) {
                bt_dev_err(hdev, "sending frame failed (%d)", err);
                kfree_skb(skb);
        }

And that is it. We are not checking for ENODEV or any other error here. That means the failure of the HCI command only gets caught via the HCI command timeout. I don’t know how to do this yet, but you would have to look there to fail the HCI command right away instead of waiting for the timeout.
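
To illustrate the difference, here is a rough userspace model in C. It is not kernel code and not a proposed patch. Everything in it (struct model_hdev, send_cmd, the fail_fast flag, the two transport stubs) is a hypothetical name made up for this sketch; the only thing taken from the real code is the shape of the "err = hdev->send(...)" check. It just shows a pending command being completed with the error immediately, rather than being left for the timeout to clean up.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device state; not the kernel's struct hci_dev. */
struct model_hdev {
	int (*send)(struct model_hdev *hdev, const void *frame); /* transport hook */
	bool cmd_pending;  /* a command is still waiting for completion */
	bool timer_armed;  /* stands in for the command timeout timer */
	int cmd_result;    /* 0 or a negative errno once the command is failed */
};

/* Transport stub that behaves as if the USB device has already gone away. */
int transport_gone(struct model_hdev *hdev, const void *frame)
{
	(void)hdev;
	(void)frame;
	return -ENODEV;
}

/* Transport stub that accepts every frame. */
int transport_ok(struct model_hdev *hdev, const void *frame)
{
	(void)hdev;
	(void)frame;
	return 0;
}

/*
 * Queue one command and hand it to the transport.
 *
 * fail_fast == false models what is described above: the error is only
 * returned (logged, in the real code) and the command stays pending until
 * the timeout fires.
 *
 * fail_fast == true models the suggested direction: on a send error the
 * timer is cancelled and the pending command is completed with that error
 * right away.
 */
int send_cmd(struct model_hdev *hdev, const void *frame, bool fail_fast)
{
	int err;

	assert(hdev != NULL && hdev->send != NULL);

	hdev->cmd_pending = true;
	hdev->timer_armed = true;
	hdev->cmd_result = 0;

	err = hdev->send(hdev, frame);
	if (err < 0 && fail_fast) {
		hdev->timer_armed = false;
		hdev->cmd_pending = false;
		hdev->cmd_result = err;
	}

	return err;
}
```

With transport_gone as the send hook, send_cmd(&h, NULL, false) leaves cmd_pending and timer_armed set, which corresponds to waiting for the timeout. The same call with fail_fast set to true clears both and stores -ENODEV in cmd_result. Where such a completion would actually have to happen in the real HCI core is exactly the open question.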

Regards

Marcel



