Hi Olivier, On Tue, Sep 5, 2023 at 1:42 PM Luiz Augusto von Dentz <luiz.dentz@xxxxxxxxx> wrote: > > Hi Olivier, > > On Mon, Sep 4, 2023 at 3:12 PM Olivier L'Heureux > <olivier.lheureux@xxxxxxx> wrote: > > > > Request for Comments > > > > Summary > > ======= > > > > We have found a memory leak in the Bluetooth L2CAP subsystem and fixed > > it, but this opened the doors to use-after-free problems, which are > > not completely fixed yet. This patch series present a way to reproduce > > it, the proposed fix and the status. There is more detailed > > documentation in [1]. > > > > Memory Leak Overview > > ==================== > > > > We have found a memory leak in the L2CAP layer of Linux's Bluetooth > > Subsystem, the same as reported by syzbot in "[syzbot] [bluetooth?] > > memory leak in hci_conn_add (2)" (2023-09-02 23:25:00 -0700) [19]. > > > > We can reproduce it. When, in a loop, a user-mode program: > > > > 1. Opens a Bluetooth socket at the L2CAP layer: > > > > sockfd = socket(AF_BLUETOOTH, SOCK_SEQPACKET, BTPROTO_L2CAP); > > > > 2. Set a timeout on the socket: > > > > timeout.tv_sec = 2; > > timeout.tv_usec = 0; > > > > err = setsockopt(sockfd, SOL_SOCKET, SO_SNDTIMEO, &timeout, sizeof(timeout)); > > > > 3. Connect to a specific Bluetooth address: > > > > struct sockaddr_l2 ble_addr; > > > > memset(&ble_addr, 0, sizeof(ble_addr)); > > ble_addr.l2_family = AF_BLUETOOTH; > > ble_addr.l2_psm = htobs(0x80 /* L2CAP_PSM_LE_DYN_START */); > > // l2_bdaddr=00:0a:07:a3:22:00 > > ble_addr.l2_bdaddr = *(bdaddr_t*)("\x00\x0a\x07\xa3\x22\x00"); > > ble_addr.l2_cid = htobs(0); > > ble_addr.l2_bdaddr_type = BDADDR_LE_PUBLIC; > > > > err = connect(sockfd, (struct sockaddr*)&ble_addr, sizeof(ble_addr)); > > > > and when the user program does those three steps in a loop, then the > > kernel leaks the "struct l2cap_conn" [5] and "struct hci_conn" [6] > > objects allocated in "l2cap_conn_add()" [2] and "hci_conn_add()" [3]. > > Those objects are never freed. > > > > The "ble-memleak-repro" program reproduces the memory leak, if the > > kernel is not patched. Its source is in > > "package/ble-memleak-repro/ble-memleak-repro.c" [18]. > > We should probably create a test in l2cap-tester using SO_SNDTIMEO > first, so we can make sure CI test such case and we are able to detect > if the problem is reintroduced later: > > https://github.com/bluez/bluez/blob/master/tools/l2cap-tester.c > > > Memory Leak Fix > > =============== > > > > We have fixed the memory leak, see the patch series in > > "patches/linux/": > > > > 1. The first patch > > "0001-ARM-dts-stm32-Add-Bluetooth-usart2-feature-on-stm32m.patch" [11] > > enables Bluetooth on the DK2. > > This above should probably be sent separately. > > > 2. The second patch > > "0002-Bluetooth-add-many-traces-for-allocation-free-refcou.patch" [12] > > adds many dynamic debug traces that help understand the kernel > > behaviour and freeing problems. > > I'm fine with this change, but we better use the likes of bt_dev_dbg > instead of BT_DBG. > > > 3. Patches > > "0003-Bluetooth-L2CAP-use-refcount-on-struct-l2cap_chan-co.patch" [13] > > and > > "0004-Bluetooth-L2CAP-free-the-leaking-struct-l2cap_conn.patch" [14] > > fix the memory leak. > > 4. Patches > > "0005-Bluetooth-introduce-hci_conn_free-for-better-structu.patch" [15], > > "0006-Bluetooth-L2CAP-inc-refcount-if-reuse-struct-l2cap_c.patch" [16] > > and > > "0007-Bluetooth-unlink-objects-to-avoid-use-after-free.patch" [17] > > fixes the use-after-free that appears when the "struct l2cap_conn" > > [5] and "struct hci_conn" [6] objects are freed. > > These I'm not very comfortable applying as they are, I'm afraid there > could be regressions if they are not accompanied with proper tests, > besides I think there are less intrusive ways to cleanup l2cap_conn, > see below. > > > The commit messages explain why the commit is needed, which problem > > the commit solves, and how. > > > > The first and second patches are present for the memory leak > > reproduction only, they should not be part of a final fix. > > > > Patch Status > > ============ > > > > As of writing, the memory leak is fixed. The fix opened the door to > > other problems, especially use-after-free, sometimes followed by > > crashes due to NULL dereferences. We think there are weak references > > (i.e. pointers that don't increment the refcounter) previously > > pointing to memory that was never freed, but now is freed. On kernels > > v5.13 and v5.15, patches 0005, 0006 and 0007 seem to fix the > > use-after-free and NULL dereferences, but not on kernel v6.5 yet. > > I think the problem is that the lifetime of l2cap_conn shall be hooked > with hci_chan, but the likes of hci_chan_list_flush -> hci_chan_del > don't actually trigger l2cap_conn_del, which imo is the culprit here, > because connect_cfm/disconnect_cfm is not called when the connection > procedure has been aborted prematurely, so perhaps we shall get rid of > the likes of l2cap_connect_cfm/l2cap_disconn_cfm and instead do the > cleanup with in the following order: > hci_conn_cleanup->hci_chan_list_flush->hci_chan_del->l2cap_conn_del, > that way we avoid having multiple code paths attempting to cleanup > objects associated with hci_conn/hci_chan. > > I'd change hci_chan_create to take a del callback to avoid having > circular dependencies on one another though. Are you still planning to work on these changes, or perhaps you need more guidance regarding the suggested modifications? > > Reproducing with Buildroot > > ========================== > > > > We have reproduced and fixed the memory leak with Buildroot [7] and a > > "ble-memleak-repro" test application on an ST's Discovery Kit DK2 [4]. > > > > The "ble-memleak-repro" repository [1] contains the sources of a > > complete external Buildroot customisation [8], with the patches, a > > README, and more explanations to reproduce the problem with Buildroot > > on a DK2. > > > > References: > > =========== > > > > - [1]: <https://gitlab.com/essensium-mind/ble-memleak-repro.git> > > "ble-memleak-repro repository" > > > > - [2]: <https://elixir.bootlin.com/linux/v6.5/source/net/bluetooth/l2cap_core.c#L7833> > > "l2cap_conn_add()" > > > > - [3]: <https://elixir.bootlin.com/linux/v6.5/source/net/bluetooth/hci_conn.c#L986> > > "hci_conn_add()" > > > > - [4]: <https://www.st.com/content/st_com/en/products/evaluation-tools/product-evaluation-tools/mcu-mpu-eval-tools/stm32-mcu-mpu-eval-tools/stm32-discovery-kits/stm32mp157c-dk2.html> > > "STM32MP157C-DK2 (DK2)" > > > > - [5]: <https://elixir.bootlin.com/linux/v6.5/source/include/net/bluetooth/l2cap.h#L674> > > "struct l2cap_conn" > > > > - [6]: <https://elixir.bootlin.com/linux/v6.5/source/include/net/bluetooth/hci_core.h#L685> > > "struct hci_conn" > > > > - [7]: <https://buildroot.org/> > > "Buildroot is a tool to generate embedded Linux systems" > > > > - [8]: <https://buildroot.org/downloads/manual/manual.html#outside-br-custom> > > "9.2. Keeping customizations outside of Buildroot" > > > > - [11]: <patches/linux/0001-ARM-dts-stm32-Add-Bluetooth-usart2-feature-on-stm32m.patch> > > "0001-ARM-dts-stm32-Add-Bluetooth-usart2-feature-on-stm32m.patch" > > > > - [12]: <patches/linux/0002-Bluetooth-add-many-traces-for-allocation-free-refcou.patch> > > "0002-Bluetooth-add-many-traces-for-allocation-free-refcou.patch" > > > > - [13]: <patches/linux/0003-Bluetooth-L2CAP-use-refcount-on-struct-l2cap_chan-co.patch> > > "0003-Bluetooth-L2CAP-use-refcount-on-struct-l2cap_chan-co.patch" > > > > - [14]: <patches/linux/0004-Bluetooth-L2CAP-free-the-leaking-struct-l2cap_conn.patch> > > "0004-Bluetooth-L2CAP-free-the-leaking-struct-l2cap_conn.patch" > > > > - [15]: <patches/linux/0005-Bluetooth-introduce-hci_conn_free-for-better-structu.patch> > > "0005-Bluetooth-introduce-hci_conn_free-for-better-structu.patch" > > > > - [16]: <patches/linux/0006-Bluetooth-L2CAP-inc-refcount-if-reuse-struct-l2cap_c.patch> > > "0006-Bluetooth-L2CAP-inc-refcount-if-reuse-struct-l2cap_c.patch" > > > > - [17]: <patches/linux/0007-Bluetooth-unlink-objects-to-avoid-use-after-free.patch> > > "0007-Bluetooth-unlink-objects-to-avoid-use-after-free.patch" > > > > - [18]: <https://gitlab.com/essensium-mind/ble-memleak-repro/-/blob/main/package/ble-memleak-repro/ble-memleak-repro.c?ref_type=heads> > > "ble-memleak-repro.c" > > > > - [19]: <https://lore.kernel.org/linux-bluetooth/0000000000000fd01206046e7410@xxxxxxxxxx/T/#u> > > "[syzbot] [bluetooth?] memory leak in hci_conn_add (2)" > > 2023-09-02 23:25:00 -0700 > > > > --- > > > > Christophe Roullier (1): > > ARM: dts: stm32: Add Bluetooth (usart2) feature on stm32mp157x > > > > Olivier L'Heureux (6): > > Bluetooth: add many traces for allocation/free/refcounting > > Bluetooth: L2CAP: use refcount on struct l2cap_chan->conn > > Bluetooth: L2CAP: free the leaking struct l2cap_conn > > Bluetooth: introduce hci_conn_free() for better structure > > Bluetooth: L2CAP: inc refcount if reuse struct l2cap_conn > > Bluetooth: unlink objects to avoid use-after-free > > > > arch/arm/boot/dts/st/stm32mp157c-dk2.dts | 11 ++++++- > > include/net/bluetooth/hci_core.h | 7 +++-- > > net/bluetooth/hci_conn.c | 18 ++++++++++++ > > net/bluetooth/hci_sysfs.c | 8 ++++- > > net/bluetooth/l2cap_core.c | 37 ++++++++++++++++++++---- > > 5 files changed, 72 insertions(+), 9 deletions(-) > > > > > > base-commit: 2dde18cd1d8fac735875f2e4987f11817cc0bc2c > > -- > > 2.39.2 > > > > > -- > Luiz Augusto von Dentz -- Luiz Augusto von Dentz