Re: [PATCH] Bluetooth: check L2CAP length in first ACL fragment

Andrei Emeltchenko <andrei.emeltchenko.news@xxxxxxxxx> · Mon, 6 Sep 2010 14:32:44 +0300




Hi,

On Fri, Sep 3, 2010 at 7:26 PM, Mat Martineau <mathewm@xxxxxxxxxxxxxx> wrote:
>> Trace below is received when using stress tools sending big
>> fragmented L2CAP packets.
>> ...
>> [ 1712.798492] swapper: page allocation failure. order:4, mode:0x4020
>> [ 1712.804809] [<c0031870>] (unwind_backtrace+0x0/0xdc) from [<c00a1f70>]
>> (__alloc_pages_nodemask+0x4)
>> [ 1712.814666] [<c00a1f70>] (__alloc_pages_nodemask+0x47c/0x4d4) from
>> [<c00a1fd8>] (__get_free_pages+)
>> [ 1712.824645] [<c00a1fd8>] (__get_free_pages+0x10/0x3c) from [<c026eb5c>]
>> (__alloc_skb+0x4c/0xfc)
>> [ 1712.833465] [<c026eb5c>] (__alloc_skb+0x4c/0xfc) from [<bf28c738>]
>> (l2cap_recv_acldata+0xf0/0x1f8 )
>> [ 1712.843322] [<bf28c738>] (l2cap_recv_acldata+0xf0/0x1f8 [l2cap]) from
>> [<bf0094ac>] (hci_rx_task+0x)
>> ...
>
> What is logged right before this allocation failure?  There should be a
> "Start: total len %d, frag len %d" line.  Actually, all log messages from
> l2cap_recv_acldata leading up to the failure would be helpful.

When I enable logs system behave differently because of huge traffic
to syslog :-)

>> After applying patch dmesg looks like:
>> ...
>> l2cap_recv_acldata: Frame exceeding recv MTU (len 64182, MTU 1013)
>> l2cap_recv_acldata: Unexpected continuation frame (len 34)
>> ...
>>
>> Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@xxxxxxxxx>
>> ---
>> net/bluetooth/l2cap.c |   11 +++++++++++
>> 1 files changed, 11 insertions(+), 0 deletions(-)
>>
>> diff --git a/net/bluetooth/l2cap.c b/net/bluetooth/l2cap.c
>> index 9fad312..21824d7 100644
>> --- a/net/bluetooth/l2cap.c
>> +++ b/net/bluetooth/l2cap.c
>> @@ -4654,6 +4654,8 @@ static int l2cap_recv_acldata(struct hci_conn *hcon,
>> struct sk_buff *skb, u16 fl
>>
>>        if (flags & ACL_START) {
>>                struct l2cap_hdr *hdr;
>> +               struct sock *sk;
>> +               u16 cid;
>>                int len;
>>
>>                if (conn->rx_len) {
>> @@ -4672,6 +4674,7 @@ static int l2cap_recv_acldata(struct hci_conn *hcon,
>> struct sk_buff *skb, u16 fl
>> d
>>                hdr = (struct l2cap_hdr *) skb->data;
>>                len = __le16_to_cpu(hdr->len) + L2CAP_HDR_SIZE;
>> +               cid = __le16_to_cpu(hdr->cid);
>
> Some stress tools also send L2CAP data one byte at a time - the start
> fragment may not have all four header bytes.  l2cap_recv_acldata() currently
> allows short start fragments with two or three bytes, which would not
> contain a valid CID.

Yes currently we allow 2 bytes in start fragment and check hdr->len.
What about following change (require 4 bytes in start fragment)

---------------------------------------------
-               if (skb->len < 2) {
+               if (skb->len < L2CAP_HDR_SIZE) {
                        BT_ERR("Frame is too short (len %d)", skb->len);
                        l2cap_conn_unreliable(conn, ECOMM);
                        goto drop;
---------------------------------------------

>>                if (len == skb->len) {
>>                        /* Complete frame received */
>> @@ -4688,6 +4691,14 @@ static int l2cap_recv_acldata(struct hci_conn
>> *hcon, struct sk_buff *skb, u16 fl
>>                        goto drop;
>>                }
>>
>> +               sk = l2cap_get_chan_by_scid(&conn->chan_list, cid);
>
> There are several issues here.
>
> l2cap_get_chan_by_scid() might return NULL, which will definitely happen
> with signaling or connectionless data frames.
>
> l2cap_get_chan_by_scid() also calls bh_lock_sock() if a channel is found,
> and I don't see that you added a matching call to bh_unlock_sock() if
> l2cap_get_chan_by_scid() returns a non-NULL value.

My mistake here. What about following check:

---------------------------------------
@@ -3916,6 +3919,19 @@ static int l2cap_recv_acldata(struct hci_conn
*hcon, struct sk_buff *skb,
                        goto drop;
                }

+               sk = l2cap_get_chan_by_scid(&conn->chan_list, cid);
+               if (!sk)
+                       goto drop;
+
+               if (l2cap_pi(sk)->imtu < len) {
+                       BT_ERR("Frame exceeding recv MTU (len %d, MTU %d)",
+                                       len, l2cap_pi(sk)->imtu);
+                       conn->rx_len = 0; /* needed? */
+                       bh_unlock_sock(sk);
+                       goto drop;
+               } else
+                       bh_unlock_sock(sk);
+
                /* Allocate skb for the complete frame (with header) */
                conn->rx_skb = bt_skb_alloc(len, GFP_ATOMIC);
                if (!conn->rx_skb)

--------------------------------------

>
>> +               if (l2cap_pi(sk)->imtu < len) {
>> +                       BT_ERR("Frame exceeding recv MTU (len %d, MTU
>> %d)",
>> +                                       len, l2cap_pi(sk)->imtu);
>> +                       conn->rx_len = 0; /* needed? */
>> +                       goto drop;
>> +               }
>> +
>>                /* Allocate skb for the complete frame (with header) */
>>                conn->rx_skb = bt_skb_alloc(len, GFP_ATOMIC);
>>                if (!conn->rx_skb)
>
> In general, this approach adds an extra channel lookup and an extra
> lock/unlock on every ACL start frame.

Yes this adds extra lock/unlock but only if the frame is not complete. There
shouldn't be many fragmented L2CAP packets.

>  Even if the MTU is known to be
> exceeded on with the start frame, no more than 64k is ever allocated (which
> shouldn't cause problems in itself).

I think that for the mobile device this can be a problem since this
shall be contiguous
memory.

> The root of the problem may be heap
> corruption due to a buffer overrun, which seems possible due to the crash
> deep in the memory allocator.

Regards,
Andrei
--
To unsubscribe from this list: send the line "unsubscribe linux-bluetooth" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html