On 29/8/21 10:53 pm, Desmond Cheong Zhi Xi wrote:
On 29/8/21 4:29 pm, Hillf Danton wrote:
On Fri, 27 Aug 2021 15:58:34 +0800 Desmond Cheong Zhi Xi wrote:
On 27/8/21 9:19 am, Hillf Danton wrote:
On Thu, 26 Aug 2021 09:29:24 -0700
syzbot found the following issue on:
HEAD commit: e3f30ab28ac8 Merge branch 'pktgen-samples-next'
git tree: net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=13249c96300000
kernel config: https://syzkaller.appspot.com/x/.config?x=ef482942966bf763
dashboard link: https://syzkaller.appspot.com/bug?extid=2bef95d3ab4daa10155b
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.1
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16a29ea9300000
The issue was bisected to:
commit e1dee2c1de2b4dd00eb44004a4bda6326ed07b59
Author: Desmond Cheong Zhi Xi <desmondcheongzx@xxxxxxxxx>
Date: Tue Aug 10 04:14:10 2021 +0000
Bluetooth: fix repeated calls to sco_sock_kill
To fix the UAF, take another reference on the sock so that the timeout work is safe.
#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git e3f30ab28ac8
--- a/net/bluetooth/sco.c
+++ b/net/bluetooth/sco.c
@@ -190,15 +190,14 @@ static void sco_conn_del(struct hci_conn
 	sco_conn_unlock(conn);
 	if (sk) {
-		sock_hold(sk);
 		lock_sock(sk);
 		sco_sock_clear_timer(sk);
 		sco_chan_del(sk, err);
 		release_sock(sk);
-		sock_put(sk);
 		/* Ensure no more work items will run before freeing conn. */
 		cancel_delayed_work_sync(&conn->timeout_work);
+		sock_put(sk);
Hi Hillf,
Saw that this passed the reproducer. But on closer inspection, I think
what's happening is that sco_conn_del is never run.
So the extra sock_hold prevents a UAF, but only because the reference
count now never drops to 0. In my opinion, something closer to your
previous proposal (plus also covering the other calls to
__sco_sock_close), where we call cancel_delayed_work_sync after the
channel is deleted, would address the root cause better.
Just my two cents.
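To illustrate the concern about the reference count, here is a minimal user-space sketch. It is not kernel code: toy_obj, toy_new, toy_hold and toy_put are hypothetical names, and a plain int stands in for the atomic refcount the kernel uses. It only shows that an unmatched hold keeps the final put from freeing the object, which hides a UAF without fixing it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted object such as a sock. */
struct toy_obj {
	int refcnt;   /* not atomic: single-threaded illustration only */
	bool *freed;  /* set to true when the object is released */
};

/* Allocate an object holding one initial reference. */
struct toy_obj *toy_new(bool *freed)
{
	struct toy_obj *o = malloc(sizeof(*o));

	if (!o)
		return NULL;
	o->refcnt = 1;
	o->freed = freed;
	*freed = false;
	return o;
}

/* Take an extra reference (analogous in spirit to sock_hold). */
void toy_hold(struct toy_obj *o)
{
	o->refcnt++;
}

/* Drop a reference; returns true if this call freed the object. */
bool toy_put(struct toy_obj *o)
{
	assert(o->refcnt > 0); /* catch an underflow early */
	if (--o->refcnt == 0) {
		*o->freed = true;
		free(o);
		return true;
	}
	return false;
}
```

With one extra toy_hold and only one toy_put, the object is never freed. Nothing can touch freed memory, but only because the memory leaks.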
OK, I went back and did a more thorough audit. Even without calling
cancel_delayed_work_sync, sco_sock_timeout should not cause a UAF.
I believe the real issue is that we can allocate a connection twice in
sco_connect: sco_sock_connect checks sk->sk_state before taking
lock_sock, so two concurrent connect calls can both pass the check.
The first connection then gets lost and we're unable to clean it up
properly.
Thoughts on this?
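The idea can be sketched in user space as follows. This is only an analogy under stated assumptions: toy_sock, toy_sock_init and toy_connect are hypothetical names, a pthread mutex plays the role of lock_sock/release_sock, and a counter stands in for connection allocation. The point is that the state check and the state change happen under the same lock, so a second caller cannot allocate a second connection.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

#ifndef EBADFD
#define EBADFD 77 /* fallback for platforms without EBADFD */
#endif

/* Hypothetical stand-ins; not the kernel's types or states. */
enum toy_state { TOY_OPEN, TOY_BOUND, TOY_CONNECT };

struct toy_sock {
	pthread_mutex_t lock;   /* plays the role of lock_sock() */
	enum toy_state state;
	int conns_allocated;    /* how many connections were created */
};

void toy_sock_init(struct toy_sock *sk)
{
	pthread_mutex_init(&sk->lock, NULL);
	sk->state = TOY_OPEN;
	sk->conns_allocated = 0;
}

/* Check the state and act on it while holding the lock. */
int toy_connect(struct toy_sock *sk)
{
	int err = 0;

	assert(sk != NULL);
	pthread_mutex_lock(&sk->lock);
	if (sk->state != TOY_OPEN && sk->state != TOY_BOUND) {
		err = -EBADFD;
		goto done;
	}
	sk->conns_allocated++;      /* "allocate" the one connection */
	sk->state = TOY_CONNECT;    /* later callers now fail the check */
done:
	pthread_mutex_unlock(&sk->lock);
	return err;
}
```

If the check were done before taking the lock, two threads could both see TOY_OPEN and both increment conns_allocated, which mirrors the lost first connection described above.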
#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git e3f30ab28ac8
--- a/net/bluetooth/sco.c
+++ b/net/bluetooth/sco.c
@@ -578,9 +578,6 @@ static int sco_sock_connect(struct socket *sock, struct sockaddr *addr, int alen
 	    addr->sa_family != AF_BLUETOOTH)
 		return -EINVAL;
-	if (sk->sk_state != BT_OPEN && sk->sk_state != BT_BOUND)
-		return -EBADFD;
-
 	if (sk->sk_type != SOCK_SEQPACKET)
 		return -EINVAL;
@@ -591,6 +588,13 @@ static int sco_sock_connect(struct socket *sock, struct sockaddr *addr, int alen
 	lock_sock(sk);
+	if (sk->sk_state != BT_OPEN && sk->sk_state != BT_BOUND) {
+		hci_dev_unlock(hdev);
+		hci_dev_put(hdev);
+		err = -EBADFD;
+		goto done;
+	}
+
 	/* Set destination address and psm */
 	bacpy(&sco_pi(sk)->dst, &sa->sco_bdaddr);