Re: [PATCH RFC 0/7] Fix a memory leak in Bluetooth L2CAP

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Olivier,

On Wed, Sep 13, 2023 at 3:25 PM Olivier L'Heureux
<olivier.lheureux@xxxxxxx> wrote:
>
> Hello Luiz,
>
> Thanks for your review.
>
> On 05/09/2023 22:42, Luiz Augusto von Dentz wrote:
> > Hi Olivier,
> >
> > On Mon, Sep 4, 2023 at 3:12 PM Olivier L'Heureux
> > <olivier.lheureux@xxxxxxx> wrote:
> >>
> >> Request for Comments
> [...]
> >>
> >> The "ble-memleak-repro" program reproduces the memory leak, if the
> >> kernel is not patched. Its source is in
> >> "package/ble-memleak-repro/ble-memleak-repro.c" [18].
> >
> > We should probably create a test in l2cap-tester using SO_SNDTIMEO
> > first, so we can make sure CI test such case and we are able to detect
> > if the problem is reintroduced later:
> >
> > https://github.com/bluez/bluez/blob/master/tools/l2cap-tester.c
>
> I didn't know about "l2cap-tester.c". Agree, it would be great to
> integrate my test app into it. I could try, but I don't know the test
> framework yet.
>
> >> Memory Leak Fix
> >> ===============
> >>
> >> We have fixed the memory leak, see the patch series in
> >> "patches/linux/":
> >>
> >>   1. The first patch
> >>      "0001-ARM-dts-stm32-Add-Bluetooth-usart2-feature-on-stm32m.patch" [11]
> >>      enables Bluetooth on the DK2.
> >
> > This above should probably be sent separately.
> >
> >>   2. The second patch
> >>      "0002-Bluetooth-add-many-traces-for-allocation-free-refcou.patch" [12]
> >>      adds many dynamic debug traces that help understand the kernel
> >>      behaviour and freeing problems.
> >
> > I'm fine with this change, but we better use the likes of bt_dev_dbg
> > instead of BT_DBG.
>
> This commit intended to increase the visibility during debugging, and
> I was not intending it for a permanent presence in the kernel.
> But if you find it useful, I can submit a patch RFC v2 with
> "bt_dev_dbg()" instead of "BT_DBG()". Note that there is currently no
> "bt_dev_dbg()" in "l2cap_core.c" yet.
>
> >>      "0003-Bluetooth-L2CAP-use-refcount-on-struct-l2cap_chan-co.patch" [13]
> >>      and
> >>      "0004-Bluetooth-L2CAP-free-the-leaking-struct-l2cap_conn.patch" [14]
> >>      fix the memory leak.
> >>   4. Patches
> >>      "0005-Bluetooth-introduce-hci_conn_free-for-better-structu.patch" [15],
> >>      "0006-Bluetooth-L2CAP-inc-refcount-if-reuse-struct-l2cap_c.patch" [16]
> >>      and
> >>      "0007-Bluetooth-unlink-objects-to-avoid-use-after-free.patch" [17]
> >>      fixes the use-after-free that appears when the "struct l2cap_conn"
> >>      [5] and "struct hci_conn" [6] objects are freed.
> >
> > These I'm not very comfortable applying as they are, I'm afraid there
> > could be regressions if they are not accompanied with proper tests,
> > besides I think there are less intrusive ways to cleanup l2cap_conn,
> > see below.
> >
> >> The commit messages explain why the commit is needed, which problem
> >> the commit solves, and how.
> >>
> >> The first and second patches are present for the memory leak
> >> reproduction only, they should not be part of a final fix.
> >>
> >> Patch Status
> >> ============
> >>
> >> As of writing, the memory leak is fixed. The fix opened the door to
> >> other problems, especially use-after-free, sometimes followed by
> >> crashes due to NULL dereferences. We think there are weak references
> >> (i.e. pointers that don't increment the refcounter) previously
> >> pointing to memory that was never freed, but now is freed. On kernels
> >> v5.13 and v5.15, patches 0005, 0006 and 0007 seem to fix the
> >> use-after-free and NULL dereferences, but not on kernel v6.5 yet.
> >
> > I think the problem is that the lifetime of l2cap_conn shall be hooked
> > with hci_chan, but the likes of hci_chan_list_flush -> hci_chan_del
> > don't actually trigger l2cap_conn_del, which imo is the culprit here,
> > because connect_cfm/disconnect_cfm is not called when the connection
> > procedure has been aborted prematurely, so perhaps we shall get rid of
> > the likes of l2cap_connect_cfm/l2cap_disconn_cfm and instead do the
> > cleanup with in the following order:
> > hci_conn_cleanup->hci_chan_list_flush->hci_chan_del->l2cap_conn_del,
> > that way we avoid having multiple code paths attempting to cleanup
> > objects associated with hci_conn/hci_chan.
>
> I fully agree with your analysis, which correspond to mine: we should
> call "l2cap_conn_del()", it would properly clean the allocations in
> "l2cap_conn_add()".
> I have tried but it was not obvious to find the right place to call
> "l2cap_conn_del()" with the proper locking.
> As you write, connect_cfm/disconnect_cfm is not called when the
> connection is aborted, and that is the root cause of the memory leak.

Btw, I was trying to reproduce it with the following test set, but at
least kmemleak was not able to detect leaks of l2cap_conn:

https://patchwork.kernel.org/project/bluetooth/list/?series=784343

> Your proposal is most probably the best way to go.
>
> > I'd change hci_chan_create to take a del callback to avoid having
> > circular dependencies on one another though.
>
> Interesting, could you explain how you would do it? Perhaps point on
> a callback example?
>
> >> Reproducing with Buildroot
> >> ==========================
> >>
> >> We have reproduced and fixed the memory leak with Buildroot [7] and a
> >> "ble-memleak-repro" test application on an ST's Discovery Kit DK2 [4].
> >>
> >> The "ble-memleak-repro" repository [1] contains the sources of a
> >> complete external Buildroot customisation [8], with the patches, a
> >> README, and more explanations to reproduce the problem with Buildroot
> >> on a DK2.
> >>
> >> References:
> >> ===========
> >>
> >> - [1]: <https://gitlab.com/essensium-mind/ble-memleak-repro.git>
> >>         "ble-memleak-repro repository"
> [...]
>
> --
> Olivier L'Heureux



-- 
Luiz Augusto von Dentz




[Index of Archives]     [Bluez Devel]     [Linux Wireless Networking]     [Linux Wireless Personal Area Networking]     [Linux ATH6KL]     [Linux USB Devel]     [Linux Media Drivers]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Big List of Linux Books]

  Powered by Linux