hidp_get_raw_report source of 5 second delay in device removal

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Dear friends

Any thoughts on my two patches:
- [PATCH] hidp: Make hidp_get_raw_report abort if the session is
terminating
- RFC [PATCH] hci: Fix race between hidp_session and hci code

originally sent 29th Jan and resent 18th Feb?

There has been no reply to this list ...

The first patch is critical, the second makes everything neat and tidy.

The following kernel log, using linux 3.8-rc6 and DEBUG turned on in the
bluetooth & hidp modules illustrate the problem if neither of these
patches are used. I added a couple of extra printks in the
hidp_session() routine to show its progress. I also
tweaked hidp_session to avoid a possible kernel OOPS caused by this same
problem (which is avoided if the first patch was applied).
Lines with *** are my annotations to describe what is happening.

*** Kernel resuming - does usb reset as part of resume from suspend sequence ...
Feb 20 09:31:15 laptop2 kernel: [  334.204082] usb 3-2: reset full-speed USB device number 2 using uhci_hcd
Feb 20 09:31:15 laptop2 kernel: [  334.375693] btusb 3-2:1.0: no reset_resume for driver btusb?
Feb 20 09:31:15 laptop2 kernel: [  334.375697] btusb 3-2:1.1: no reset_resume for driver btusb?
Feb 20 09:31:15 laptop2 kernel: [  334.375710] hci_unregister_dev: f6119000 name hci0 bus 1
Feb 20 09:31:15 laptop2 kernel: [  334.375712] hci_dev_do_close: hci0 f6119000
Feb 20 09:31:15 laptop2 kernel: [  334.375716] hci_req_cancel: hci0 err 0x13
Feb 20 09:31:15 laptop2 kernel: [  334.375719] hci_conn_hash_flush: hdev hci0
Feb 20 09:31:15 laptop2 kernel: [  334.375722] l2cap_disconn_cfm: hcon e000bc00 reason 22
Feb 20 09:31:15 laptop2 kernel: [  334.375725] l2cap_conn_del: hcon e000bc00 conn f3294800, err 103
Feb 20 09:31:15 laptop2 kernel: [  334.375727] l2cap_chan_hold: chan e000a400 orig refcnt 3
Feb 20 09:31:15 laptop2 kernel: [  334.375730] l2cap_chan_del: chan e000a400, conn f3294800, err 103
Feb 20 09:31:15 laptop2 kernel: [  334.375732] l2cap_chan_put: chan e000a400 orig refcnt 4
Feb 20 09:31:15 laptop2 kernel: [  334.375734] hci_conn_put: hcon e000bc00 orig refcnt 2
Feb 20 09:31:15 laptop2 kernel: [  334.375742] l2cap_chan_put: chan e000a400 orig refcnt 3
Feb 20 09:31:15 laptop2 kernel: [  334.375744] l2cap_chan_hold: chan e0009400 orig refcnt 3
Feb 20 09:31:15 laptop2 kernel: [  334.375746] l2cap_chan_del: chan e0009400, conn f3294800, err 103
Feb 20 09:31:15 laptop2 kernel: [  334.375748] l2cap_chan_put: chan e0009400 orig refcnt 4
Feb 20 09:31:15 laptop2 kernel: [  334.375750] hci_conn_put: hcon e000bc00 orig refcnt 1
Feb 20 09:31:15 laptop2 kernel: [  334.375754] l2cap_chan_put: chan e0009400 orig refcnt 3
Feb 20 09:31:15 laptop2 kernel: [  334.375756] hci_chan_del: hci0 hcon e000bc00 chan e00d7dc0
*** the l2cap_disonn_cfm/l2cap_chan_del above triggers the hidp_session
*** thread, so hidp_session can now fall out its main loop.
*** hidp_session then does hid_destroy_device(), which through code in the power_supply
*** part of the kernel invokes a battery status lookup which requires a call to hidp_get_raw_report
Feb 20 09:31:15 laptop2 kernel: [  334.375765] hidp_session Out of main loop
Feb 20 09:31:15 laptop2 kernel: [  334.375767] hidp_session About to do hid_destroy_device
Feb 20 09:31:15 laptop2 kernel: [  334.375857] hidp_get_raw_report: Entering hidp_get_raw_report
Feb 20 09:31:15 laptop2 kernel: [  334.375860] __hidp_send_ctrl_message: session f3294f00 data e0115d36 size 1
*** hidp_get_raw_report sends a ctrl message and then waits. Meanwhile the main kernel gets
*** on with running the hci driver code ...
Feb 20 09:31:15 laptop2 kernel: [  334.376725] hci_conn_put: hcon e000bc00 orig refcnt 0
Feb 20 09:31:15 laptop2 kernel: [  334.376729] hci_conn_del: hci0 hcon e000bc00 handle 46
Feb 20 09:31:15 laptop2 kernel: [  334.376732] hci_chan_list_flush: hcon e000bc00
Feb 20 09:31:15 laptop2 kernel: [  334.376832] hci_dev_put: hci0 orig refcnt 12
Feb 20 09:31:15 laptop2 kernel: [  334.376835] hci_sock_dev_event: hdev hci0 event 4
Feb 20 09:31:15 laptop2 kernel: [  334.376839] hci_send_to_sock: hdev   (null) len 8
Feb 20 09:31:15 laptop2 kernel: [  334.376850] hci_dev_put: hci0 orig refcnt 11
Feb 20 09:31:15 laptop2 kernel: [  334.376854] hci_send_to_control: len 6
Feb 20 09:31:15 laptop2 kernel: [  334.376856] hci_sock_dev_event: hdev hci0 event 2
Feb 20 09:31:15 laptop2 kernel: [  334.376859] hci_send_to_sock: hdev   (null) len 8
Feb 20 09:31:15 laptop2 kernel: [  334.376864] hci_dev_put: hci0 orig refcnt 10
Feb 20 09:31:15 laptop2 kernel: [  334.376910] hci_del_sysfs: f6119000 name hci0 bus 1
Feb 20 09:31:15 laptop2 kernel: [  334.376994] hci_dev_put: hci0 orig refcnt 5
*** the 'hci' device is the sysfs tree has now just been removed - yet the various devices
*** dependent on it have yet to be removed (input, event, hidraw, hci_conn)

*** Note the 5 SECOND gap ... thats the timeout value in hidp_get_raw_report!
*** having timed out, the hidp_session thread gets underway again
*** the various devices are now removed (input, event, hidraw, hci_conn)
*** and life can go on
Feb 20 09:31:15 laptop2 kernel: [  339.372148] hidp_get_raw_report timed out!
Feb 20 09:31:15 laptop2 kernel: [  339.372153] power_supply hid-60:c5:47:1b:07:b8-battery: driver failed to report `capacity' property: -5
Feb 20 09:31:15 laptop2 kernel: [  339.376347] PM: resume of devices complete after 5491.694 msecs
Feb 20 09:31:15 laptop2 kernel: [  339.376362] hidp_session Done hid_destry_device
Feb 20 09:31:15 laptop2 kernel: [  339.376366] hci_conn_del_sysfs: conn e000bc00
Feb 20 09:31:15 laptop2 kernel: [  339.376392] hci_dev_put: hci0 orig refcnt 1
*** a new hci device is registered, and a new hci_conn, and sub-tree of input devices
*** will be added, which is normal and to be expected.
*** NOTE though whilst the new registration occurs here in this log, I have seen it
*** occur much earlier in other logs (different versions of kernel), leading to
*** various problems when a new hci_conn is added whilst the old one has yet to be deleted.
Feb 20 09:31:15 laptop2 kernel: [  339.376424] hci_register_dev: f3367000 name hci0 bus 1
Feb 20 09:31:15 laptop2 kernel: [  339.376475] hci_add_sysfs: f3367000 name hci0 bus 1
Feb 20 09:31:15 laptop2 kernel: [  339.376561] hci_sock_dev_event: hdev hci0 event 1
Feb 20 09:31:15 laptop2 kernel: [  339.376564] hci_send_to_sock: hdev   (null) len 8
Feb 20 09:31:15 laptop2 kernel: [  339.376568] hci_dev_hold: hci0 orig refcnt 5
Feb 20 09:31:15 laptop2 kernel: [  339.378210] bt_sock_poll: sock ef627500, sk f30afe00
Feb 20 09:31:15 laptop2 kernel: [  339.384283] bt_sock_poll: sock f2d95380, sk f675a600
Feb 20 09:31:15 laptop2 kernel: [  339.384291] bt_sock_poll: sock f5be0c00, sk f673f600
Feb 20 09:31:15 laptop2 kernel: [  339.384295] bt_sock_poll: sock f5be0180, sk f600ac00
Feb 20 09:31:15 laptop2 kernel: [  339.384298] bt_sock_poll: sock f5b30180, sk f66c4400
Feb 20 09:31:15 laptop2 kernel: [  339.384301] bt_sock_poll: sock f5b30300, sk f6691e00
Feb 20 09:31:15 laptop2 kernel: [  339.384303] bt_sock_poll: sock f5b30480, sk f673e000
Feb 20 09:31:15 laptop2 kernel: [  339.384306] bt_sock_poll: sock f5b30600, sk f673e600
Feb 20 09:31:15 laptop2 kernel: [  339.384309] bt_sock_poll: sock f5b30780, sk f675b800
Feb 20 09:31:15 laptop2 kernel: [  339.384311] bt_sock_poll: sock f5b30900, sk f675bc00
Feb 20 09:31:15 laptop2 kernel: [  339.384314] bt_sock_poll: sock f5b30a80, sk f675a800
Feb 20 09:31:15 laptop2 kernel: [  339.384320] bt_sock_poll: sock f2d94180, sk f66c4800
Feb 20 09:31:15 laptop2 kernel: [  339.384323] bt_sock_poll: sock f5b30000, sk f66c5200
Feb 20 09:31:15 laptop2 kernel: [  339.384325] bt_sock_poll: sock df8a8300, sk e968c800
Feb 20 09:31:15 laptop2 kernel: [  339.384328] bt_sock_poll: sock df8a8180, sk e968d600
Feb 20 09:31:15 laptop2 kernel: [  339.384357] hci_sock_recvmsg: sock f5be0780, sk f673ea00
Feb 20 09:31:15 laptop2 kernel: [  339.384574] l2cap_sock_release: sock df8a8300, sk e968c800
Feb 20 09:31:15 laptop2 kernel: [  339.384578] l2cap_sock_shutdown: sock df8a8300, sk e968c800
Feb 20 09:31:15 laptop2 kernel: [  339.384582] l2cap_chan_close: chan e000a400 state BT_CLOSED sk e968c800
Feb 20 09:31:15 laptop2 kernel: [  339.384586] l2cap_sock_kill: sk e968c800 state BT_CLOSED
Feb 20 09:31:15 laptop2 kernel: [  339.384589] l2cap_chan_put: chan e000a400 orig refcnt 2
Feb 20 09:31:15 laptop2 kernel: [  339.384592] l2cap_sock_destruct: sk e968c800
Feb 20 09:31:15 laptop2 kernel: [  339.384594] l2cap_chan_put: chan e000a400 orig refcnt 1
Feb 20 09:31:15 laptop2 kernel: [  339.384597] l2cap_chan_destroy: chan e000a400
Feb 20 09:31:15 laptop2 kernel: [  339.384656] l2cap_sock_release: sock df8a8180, sk e968d600
Feb 20 09:31:15 laptop2 kernel: [  339.384659] l2cap_sock_shutdown: sock df8a8180, sk e968d600
Feb 20 09:31:15 laptop2 kernel: [  339.384663] l2cap_chan_close: chan e0009400 state BT_CLOSED sk e968d600
Feb 20 09:31:15 laptop2 kernel: [  339.384666] l2cap_sock_kill: sk e968d600 state BT_CLOSED
Feb 20 09:31:15 laptop2 kernel: [  339.384668] l2cap_chan_put: chan e0009400 orig refcnt 2
Feb 20 09:31:15 laptop2 kernel: [  339.384670] l2cap_sock_destruct: sk e968d600
Feb 20 09:31:15 laptop2 kernel: [  339.384673] l2cap_chan_put: chan e0009400 orig refcnt 1
Feb 20 09:31:15 laptop2 kernel: [  339.384675] l2cap_chan_destroy: chan e0009400
Feb 20 09:31:15 laptop2 kernel: [  339.384707] bt_sock_poll: sock f2d95380, sk f675a600
Feb 20 09:31:15 laptop2 kernel: [  339.384710] bt_sock_poll: sock f5be0c00, sk f673f600
Feb 20 09:31:15 laptop2 kernel: [  339.384713] bt_sock_poll: sock f5be0180, sk f600ac00
Feb 20 09:31:15 laptop2 kernel: [  339.384715] bt_sock_poll: sock f5b30180, sk f66c4400
Feb 20 09:31:15 laptop2 kernel: [  339.384718] bt_sock_poll: sock f5b30300, sk f6691e00
Feb 20 09:31:15 laptop2 kernel: [  339.384720] bt_sock_poll: sock f5b30480, sk f673e000
Feb 20 09:31:15 laptop2 kernel: [  339.384722] bt_sock_poll: sock f5b30600, sk f673e600
Feb 20 09:31:15 laptop2 kernel: [  339.384725] bt_sock_poll: sock f5b30780, sk f675b800
Feb 20 09:31:15 laptop2 kernel: [  339.384727] bt_sock_poll: sock f5b30900, sk f675bc00
Feb 20 09:31:15 laptop2 kernel: [  339.384729] bt_sock_poll: sock f5b30a80, sk f675a800
Feb 20 09:31:15 laptop2 kernel: [  339.384733] bt_sock_poll: sock f2d94180, sk f66c4800
Feb 20 09:31:15 laptop2 kernel: [  339.384735] bt_sock_poll: sock f5b30000, sk f66c5200
Feb 20 09:31:15 laptop2 kernel: [  339.384743] hci_sock_recvmsg: sock f5be0780, sk f673ea00
Feb 20 09:31:15 laptop2 kernel: [  339.384771] hci_sock_release: sock f5be0600 sk f673e800
Feb 20 09:31:15 laptop2 kernel: [  339.384829] hci_sock_create: sock df4b0000
Feb 20 09:31:15 laptop2 kernel: [  339.384838] hci_sock_ioctl: cmd 800448d2 arg b885cb00
Feb 20 09:31:15 laptop2 kernel: [  339.384843] hci_sock_release: sock df4b0000 sk f1b82600
Feb 20 09:31:15 laptop2 kernel: [  339.385167] bt_accept_dequeue: parent f675a800
Feb 20 09:31:15 laptop2 kernel: [  339.385191] l2cap_sock_release: sock f5b30900, sk f675bc00
Feb 20 09:31:15 laptop2 kernel: [  339.385194] l2cap_sock_shutdown: sock f5b30900, sk f675bc00
Feb 20 09:31:15 laptop2 kernel: [  339.385197] l2cap_chan_close: chan f303b800 state BT_LISTEN sk f675bc00
Feb 20 09:31:15 laptop2 kernel: [  339.385200] l2cap_sock_cleanup_listen: parent f675bc00
Feb 20 09:31:15 laptop2 kernel: [  339.385202] bt_accept_dequeue: parent f675bc00
Feb 20 09:31:15 laptop2 kernel: [  339.385205] l2cap_sock_kill: sk f675bc00 state BT_CLOSED
Feb 20 09:31:15 laptop2 kernel: [  339.385208] l2cap_chan_put: chan f303b800 orig refcnt 2
Feb 20 09:31:15 laptop2 kernel: [  339.385210] l2cap_sock_destruct: sk f675bc00
Feb 20 09:31:15 laptop2 kernel: [  339.385213] l2cap_chan_put: chan f303b800 orig refcnt 1
Feb 20 09:31:15 laptop2 kernel: [  339.385215] l2cap_chan_destroy: chan f303b800
Feb 20 09:31:15 laptop2 kernel: [  339.385343] l2cap_sock_release: sock f5b30780, sk f675b800
Feb 20 09:31:15 laptop2 kernel: [  339.385346] l2cap_sock_shutdown: sock f5b30780, sk f675b800
Feb 20 09:31:15 laptop2 kernel: [  339.385350] l2cap_chan_close: chan f303b400 state BT_LISTEN sk f675b800
Feb 20 09:31:15 laptop2 kernel: [  339.385352] l2cap_sock_cleanup_listen: parent f675b800
Feb 20 09:31:15 laptop2 kernel: [  339.385354] bt_accept_dequeue: parent f675b800
Feb 20 09:31:15 laptop2 kernel: [  339.385357] l2cap_sock_kill: sk f675b800 state BT_CLOSED
Feb 20 09:31:15 laptop2 kernel: [  339.385360] l2cap_chan_put: chan f303b400 orig refcnt 2
Feb 20 09:31:15 laptop2 kernel: [  339.385362] l2cap_sock_destruct: sk f675b800
Feb 20 09:31:15 laptop2 kernel: [  339.385364] l2cap_chan_put: chan f303b400 orig refcnt 1
Feb 20 09:31:15 laptop2 kernel: [  339.385367] l2cap_chan_destroy: chan f303b400
Feb 20 09:31:15 laptop2 kernel: [  339.385386] bt_accept_dequeue: parent f673e600
Feb 20 09:31:15 laptop2 kernel: [  339.385397] bt_accept_dequeue: parent f6691e00
Feb 20 09:31:15 laptop2 kernel: [  339.385408] bt_accept_dequeue: parent f673e000
Feb 20 09:31:15 laptop2 kernel: [  339.394097] hci_rfkill_set_block: f3367000 name hci0 blocked 0
Feb 20 09:31:15 laptop2 kernel: [  339.394103] hci_power_on: hci0
Feb 20 09:31:15 laptop2 kernel: [  339.394106] hci_dev_get: 0
Feb 20 09:31:15 laptop2 kernel: [  339.394109] hci_dev_hold: hci0 orig refcnt 6
Feb 20 09:31:15 laptop2 kernel: [  339.394112] hci_dev_open: hci0 f3367000
Feb 20 09:31:15 laptop2 kernel: [  339.394136] __hci_request: hci0 start
Feb 20 09:31:15 laptop2 kernel: [  339.394139] hci_init_req: hci0 0
Feb 20 09:31:15 laptop2 kernel: [  339.394142] hci_reset_req: hci0 0
Feb 20 09:31:15 laptop2 kernel: [  339.394145] hci_send_cmd: hci0 opcode 0x0c03 plen 0
Feb 20 09:31:15 laptop2 kernel: [  339.394147] hci_send_cmd: skb len 3
Feb 20 09:31:15 laptop2 kernel: [  339.394155] hci_send_cmd: hci0 opcode 0x1003 plen 0
Feb 20 09:31:15 laptop2 kernel: [  339.394157] hci_send_cmd: skb len 3
Feb 20 09:31:15 laptop2 kernel: [  339.394160] hci_send_cmd: hci0 opcode 0x1001 plen 0
Feb 20 09:31:15 laptop2 kernel: [  339.394162] hci_send_cmd: skb len 3
Feb 20 09:31:15 laptop2 kernel: [  339.394168] hci_cmd_work: hci0 cmd_cnt 1 cmd queued 3
Feb 20 09:31:15 laptop2 kernel: [  339.394172] hci_send_frame: hci0 type 1 len 3
Feb 20 09:31:15 laptop2 kernel: [  339.395843] hci_rx_work: hci0
Feb 20 09:31:15 laptop2 kernel: [  339.395850] hci_rx_work: hci0 Event packet
Feb 20 09:31:15 laptop2 kernel: [  339.395854] hci_cmd_status_evt: hci0 opcode 0x0000
Feb 20 09:31:15 laptop2 kernel: [  339.376989] Restarting tasks ... done.

The result of the above is the following from udevadm monitor (i.e. the kernel udev events):
KERNEL - the kernel uevent
KERNEL[339.393586] remove   /devices/pci0000:00/0000:00:1a.0/usb3/3-2/3-2:1.0/bluetooth/hci0/rfkill0 (rfkill)
KERNEL[339.393605] remove   /devices/pci0000:00/0000:00:1a.0/usb3/3-2/3-2:1.0/bluetooth/hci0 (bluetooth)
KERNEL[339.393627] change   /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[339.393645] remove   /hci0/hci0:46/input10/event10 (input)
KERNEL[339.393673] remove   /hci0/hci0:46/input10 (input)
KERNEL[339.393692] remove   /hci0/hci0:46/0005:05AC:0256.0001/hidraw/hidraw0 (hidraw)
KERNEL[339.393711] remove   /hci0/hci0:46/0005:05AC:0256.0001 (hid)
KERNEL[339.393727] remove   /hci0/hci0:46 (bluetooth)
KERNEL[339.393743] add      /devices/pci0000:00/0000:00:1a.0/usb3/3-2/3-2:1.0/bluetooth/hci0 (bluetooth)
KERNEL[339.393761] add      /devices/pci0000:00/0000:00:1a.0/usb3/3-2/3-2:1.0/bluetooth/hci0/rfkill2 (rfkill)
KERNEL[345.114360] add      /devices/pci0000:00/0000:00:1a.0/usb3/3-2/3-2:1.0/bluetooth/hci0/hci0:46 (bluetooth)
KERNEL[345.384580] add      /devices/pci0000:00/0000:00:1a.0/usb3/3-2/3-2:1.0/bluetooth/hci0/hci0:46/0005:05AC:0256.0002 (hid)
KERNEL[345.424395] add      /devices/pci0000:00/0000:00:1a.0/usb3/3-2/3-2:1.0/bluetooth/hci0/hci0:46/input11 (input)
KERNEL[345.425001] add      /devices/pci0000:00/0000:00:1a.0/usb3/3-2/3-2:1.0/bluetooth/hci0/hci0:46/input11/event10 (input)
KERNEL[345.425312] add      /devices/pci0000:00/0000:00:1a.0/usb3/3-2/3-2:1.0/bluetooth/hci0/hci0:46/0005:05AC:0256.0002/hidraw/hidraw0 (hidraw)

Note the lines starting /hci0/hci0:46 ...

These lines are incorrect - the sysfs path is truncated, caused by device removal in the wrong order.
This can play havoc with userspace tools (like Xorg input device driver) that depends on these events.

The two patches posted resolve this. The first by avoiding the 5 second
timeout penalty (a trival fix). The second by delaying hci removal until
hci_conn references to it have all gone.

The bluetooth kernel code in its current form is problematic - hence all
kind of bug reports by users against their distributions.

Regards
Karl

--
To unsubscribe from this list: send the line "unsubscribe linux-bluetooth" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Bluez Devel]     [Linux Wireless Networking]     [Linux Wireless Personal Area Networking]     [Linux ATH6KL]     [Linux USB Devel]     [Linux Media Drivers]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Big List of Linux Books]

  Powered by Linux