Hi Dean, > This is my 3rd patchset in my series of resolving a design flaw in > hci_uart_tty_close() related to the proper closure of the Data Link protocol > layer which can result in a kernel crash. > > This patchset contains a single patch that adds rwlocking around the "proto" > function pointers. This provides a full fix to prevent a potential kernel crash > triggered by hu->proto->close() running before or during the execution of a > non-close "proto" function pointer. > > > Previous Discussions > ==================== > > Please refer to the discussion on my 2nd patchset V1 that can be found here: > https://www.spinics.net/lists/linux-bluetooth/msg70370.html > You accepted the 3 patches of my patchset of commits namely: > Bluetooth: hci_ldisc: Add protocol check to hci_uart_send_frame() > Bluetooth: hci_ldisc: Add protocol check to hci_uart_dequeue() > Bluetooth: hci_ldisc: Add protocol check to hci_uart_tx_wakeup() > > drivers/bluetooth/hci_ldisc.c | 14 +++++++++++--- > 1 file changed, 11 insertions(+), 3 deletions(-) > > Please refer to the discussion on my 1st patchset V3 that can be found here: > https://www.spinics.net/lists/linux-bluetooth/msg70315.html > You accepted the 3 patches of my patchset of commits namely: > Bluetooth: hci_ldisc: Add missing return in hci_uart_init_work() > Bluetooth: hci_ldisc: Ensure hu->hdev set to NULL before freeing hdev > Bluetooth: hci_ldisc: Add missing clear HCI_UART_PROTO_READY > > drivers/bluetooth/hci_ldisc.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > Please refer to the discussion on my 1st patchset V2 that can be found here: > https://www.spinics.net/lists/linux-bluetooth/msg70183.html > > Please refer to the discussion on my 1st RFC patchset V1 that can be found here: > https://www.spinics.net/lists/linux-bluetooth/msg70077.html > > > Analysis > ======== > > There are multiple threads that use "proto" function pointers that potentially > can be running concurrently with the thread of hci_uart_tty_close(). > > The threads include (not a complete list): > > a) hci_uart_write_work() [hci_ldisc.c] > which is executed by the work queue hu->write_work which creates the > callstack: > hci_uart_write_work() > hci_uart_dequeue() > hu->proto->dequeue() > > b) hci_tx_work() [hci_core.c] > which is executed by the work queue hdev->tx_work which creates the > callstack: > hci_tx_work() > hci_send_frame() > hdev->send() > hci_uart_send_frame() > hu->proto->enqueue() > then calls > hci_uart_tx_wakeup() > schedule_work(&hu->write_work) > triggering a) > > c) hci_cmd_work() [hci_core.c] > which is executed by the work queue hdev->cmd_work which creates the > callstack: > hci_cmd_work() > hci_send_frame() > hdev->send() > hci_uart_send_frame() > hu->proto->enqueue() > then calls > hci_uart_tx_wakeup() > schedule_work(&hu->write_work) > triggering a) > > d) hci_dev_close() [hci_core.c] > creates the callstack: > hci_dev_close() > hci_dev_do_close() > hdev->flush() > hci_uart_flush() > hu->proto->flush() > > e) When BCSP is used and a retransmission is needed: > bcsp_timed_event() [hci_bcsp.c] > creates the callstack: > bcsp_timed_event() > hci_uart_tx_wakeup() > schedule_work(&hu->write_work) > triggering a) > > > hci_ldisc.c: hci_uart_tty_close() currently has the construction: > > cancel_work_sync(&hu->write_work); > > if (test_and_clear_bit(HCI_UART_PROTO_READY, &hu->flags)) { > if (hdev) { > if (test_bit(HCI_UART_REGISTERED, &hu->flags)) > hci_unregister_dev(hdev); > hci_free_dev(hdev); > } > hu->proto->close(hu); > } > > At first glance, the cancel_work_sync(&hu->write_work) would seem to inhibit > hci_uart_write_work() from running. Actually, the call to cancel_work_sync() > can be ineffective because: > > a) hci_unregister_dev() calls hci_dev_do_close() and if HCI_UP flag is in the > set state then it is possible for hci_dev_do_close() to send a HCI command. > Such as sending a HCI RESET command which causes hci_send_frame() to schedule > hci_uart_write_work() after the cancel_work_sync() has executed. > > b) In the case of BCSP retransmissions, the callstack in e) runs periodically > every 250ms. This can occur before and after the cancel_work_sync() executed. > > > The call to hu->proto->close(hu) is partially protected by the flag > HCI_UART_PROTO_READY but there is a small race condition. A race condition > exists because hci_uart_tty_close() can clear HCI_UART_PROTO_READY just after > the concurrent thread(s) detected the flag as set. The race potentially allows > the concurrent thread(s) to run a "proto" function pointer after or during > hci_uart_tty_close() is calling hu->proto->close(hu) which could trigger a > kernel crash. > > Bear in mind that on systems with low resources, such as embedded systems, that > threads can get delayed by interrupts and pre-emption. This means that a > concurrent thread might sleep after it checks the HCI_UART_PROTO_READY flag. > Therefore, just using the flag HCI_UART_PROTO_READY is not 100% safe from > introducing race conditions. > > Note that the elapsed run-time for hci_unregister_dev() is dependent on a > number of scenarios. This means hci_unregister_dev() can quickly return or take > up to 10 seconds (guess) to return. This means that the elapsed time between > clearing HCI_UART_PROTO_READY and executing hu->proto->close(hu) is variable > and therefore there is a variable risk in triggering the kernel crash. > > > Solution implemented by this patchset > ===================================== > > Modify hci_ldisc.c: hci_uart_tty_close() to use a rwlock solution: > > cancel_work_sync(&hu->write_work); > > if (test_bit(HCI_UART_PROTO_READY, &hu->flags)) { > write_lock_irqsave(&hu->proto_lock, flags); > clear_bit(HCI_UART_PROTO_READY, &hu->flags); > write_unlock_irqrestore(&hu->proto_lock, flags); > > if (hdev) { > if (test_bit(HCI_UART_REGISTERED, &hu->flags)) > hci_unregister_dev(hdev); > hci_free_dev(hdev); > } > hu->proto->close(hu); > } > > The call to write_lock_irqsave(&hu->proto_lock, flags) will block when a > concurrent thread has called read_lock(&hu->proto_lock). This ensures that > the flag HCI_UART_PROTO_READY is only cleared when no "proto" function pointers > are running in the concurrent threads. Therefore, it is not possible for > any "proto" function pointers to be running immediately before, during or after > hu->proto->close(hu) is called. This means the race condition is avoided. > > Note that this solution will inhibit the Data Link layer protocol from > independently retransmitting because hu->proto->enqueue() cannot be called after > HCI_UART_PROTO_READY has been put into the clear state. > > > Test results from running the patchset > ====================================== > > A x86 64-bit test kernel was created based on kernel v4.11.0 plus the 7 patches: > > 3rd Patchset > Bluetooth: hci_ldisc: Use rwlocking to avoid closing proto races > 2nd Patchset > Bluetooth: hci_ldisc: Add protocol check to hci_uart_send_frame() > Bluetooth: hci_ldisc: Add protocol check to hci_uart_dequeue() > Bluetooth: hci_ldisc: Add protocol check to hci_uart_tx_wakeup() > 1st Patchset > Bluetooth: hci_ldisc: Add missing return in hci_uart_init_work() > Bluetooth: hci_ldisc: Ensure hu->hdev set to NULL before freeing hdev > Bluetooth: hci_ldisc: Add missing clear HCI_UART_PROTO_READY > > The test used btattach with BCSP which fails to establish a link because BCSP > is not fully implemented in the kernel. However, this is ideal in forcing > BCSP retransmissions. btattach was killed after 5 seconds of running to force > hci_uart_tty_close() to execute. > > Here is a snippet of a dmesg log with dynamic debug enabled: > [ 1039.415914] hci_uart_tty_close: tty ffff9861c691d800 > [ 1039.415917] hci_uart_close: hdev ffff98625b422000 > [ 1039.415920] hci_uart_flush: hdev ffff98625b422000 tty ffff9861c691d800 > [ 1039.415922] bcsp_flush: hu ffff9861c27bd480 > [ 1039.415928] hci_unregister_dev: ffff98625b422000 name hci1 bus 3 > [ 1039.418767] bcsp_timed_event: hu ffff9861c27bd480 retransmitting 2 pkts > [ 1039.674810] Bluetooth: hci1 command 0x1001 tx timeout > [ 1039.674827] hci_cmd_work: hci1 cmd_cnt 1 cmd queued 1 > [ 1039.674832] hci_send_frame: hci1 type 1 len 3 > [ 1039.674836] hci_uart_send_frame: hci1: type 1 len 3 > [ 1039.674838] Bluetooth: hci1 sending frame failed (-49) > [ 1041.722773] Bluetooth: hci1 command 0x1009 tx timeout > [ 1041.722811] hci_cmd_work: hci1 cmd_cnt 1 cmd queued 0 > [ 1045.818847] hci_uart_close: hdev ffff98625b422000 > [ 1045.818850] hci_uart_flush: hdev ffff98625b422000 tty ffff9861c691d800 > [ 1045.818868] hci_dev_do_close: hci1 ffff98625b422000 > [ 1045.819031] hci_conn_params_clear_all: All LE connection parameters were removed > [ 1045.819036] bcsp_close: hu ffff9861c27bd480 > > The following can be observed from the results: > > 1. hci_unregister_dev() ran which means HCI_UART_PROTO_READY has been cleared. > > 2. bcsp_timed_event() runs after HCI_UART_PROTO_READY has been cleared and > so hci_uart_tx_wakeup() no longer appears in the log. This inhibits further > BCSP retransmission attempts. > > 3. Bluetooth: hci1 command 0x1001 tx timeout > This occurs because BCSP was unable to successfully transfer the HCI > command. BCSP fails because the data plane to and from the UART driver is no > longer working. > > 4. Bluetooth: hci1 sending frame failed (-49) > This occurs because HCI_UART_PROTO_READY has been cleared which inhibits > a new HCI command being sent from hci_cmd_work(). > > 5. Bluetooth: hci1 command 0x1009 tx timeout > BCSP never took the HCI command but the HCI core still performs the HCI > command timeout. This means the thread gets blocked for 2 seconds for doing > nothing. > > 6. hci_uart_close() runs twice which seems to be unnecessary. This causes > hci_uart_flush() to run twice but notice the 2nd execution of > hci_uart_flush() does not run bcsp_flush() because hdev->flush gets set to > NULL in hci_uart_close(). This suggests a workaround was done in the past > to prevent a kernel crash. > > > Outline of potential further work > ================================= > > With this patchset, the design flaw in hci_uart_tty_close() is further exposed. > > It can be seen that hci_uart_tty_close() is systematically preventing > communications with the Bluetooth Radio Module. But hci_uart_tty_close() calls > hci_unregister_dev() which calls hci_dev_do_close(). If the flag HCI_UP is in > the set state then hci_dev_do_close() assumes that good communications with the > Bluetooth Radio Module are still possible. Therefore, there is a conflict in the > design because the design should not be closing the link and trying to use the > link to send HCI commands at the same time. This can trigger the protocol error > handling procedures to occur which are unsuccessful. > > In addition, it is suspected that before hci_uart_tty_close() runs, the TTY > layer has cut the data plane between the Data Link protocol and the UART driver. > This means that if hci_uart_tty_close() is rearranged to allow > HCI_UART_PROTO_READY to be cleared after hci_unregister_dev() runs then any HCI > command will still fail to reach the Bluetooth Radio Module. > > There are other issues in hci_ldisc.c: hci_uart_tty_close() > > /* Detach from the tty */ > tty->disc_data = NULL; > > if (!hu) > return; > > hdev = hu->hdev; > if (hdev) > hci_uart_close(hdev); > > It can be seen that tty->disc_data is set to NULL. This causes some of the HCI > UART LDISC functions to not perform any operation such as the functions: > > hci_uart_tty_wakeup() > hci_uart_tty_receive() > hci_uart_tty_ioctl() > > Doing "tty->disc_data = NULL" sabotages the communications with the Bluetooth > Radio Module. Therefore, this statement needs to be moved to after the > communications are going to be cleanly closed such as after hci_unregister_dev() > has completed. > > Calling hci_uart_close(hdev) is likely to interfere with the communications with > the Bluetooth Radio Module. Therefore, this statement needs to be moved to after > the communications are going to be cleanly closed such as after > hci_unregister_dev() has completed. > > > Potential solution to the design flaw > ===================================== > > Suspect that a solution would be to get hci_uart_tty_close() to run > hci_unregister_dev() before tearing down the resources needed by the > Data Link protocol layer. > > > The patch was rebased onto the bluetooth-next master branch HEAD commit: > 2da711a Bluetooth: Skip vendor diagnostic configuration for HCI User Channel > > Dean Jenkins (1): > Bluetooth: hci_ldisc: Use rwlocking to avoid closing proto races > > drivers/bluetooth/hci_ldisc.c | 40 +++++++++++++++++++++++++++++++++++----- > drivers/bluetooth/hci_uart.h | 1 + > 2 files changed, 36 insertions(+), 5 deletions(-) patch has been applied to bluetooth-next tree. Regards Marcel -- To unsubscribe from this list: send the line "unsubscribe linux-bluetooth" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html