Re: [PATCH] Bluetooth: fix dangling sco_conn and use-after-free in sco_sock_timeout

Ying Hsu <yinghsu@xxxxxxxxxxxx> · Sat, 26 Mar 2022 14:31:22 +0800

Hi Luiz,

I compiled and ran the syzkaller C reproducer:
https://syzkaller.appspot.com/x/repro.c?x=152b93e8700000
I will add the relevant links to the commit message. Thanks for the reminder.

While fixing the use-after-free problem, I also found a possible
deadlock between sco_sock_connect() and sco_sock_getsockopt(), which
take the same two locks in opposite order:
sco_sock_connect() {
  hci_dev_lock(hdev);     /* hdev lock first */
  lock_sock(sk);          /* then the socket lock */
}

sco_sock_getsockopt() {
  lock_sock(sk);          /* socket lock first */
  case BT_CODEC:
    hci_dev_lock(hdev);   /* then the hdev lock */
}

So taking lock_sock(sk) before hci_dev_lock(hdev) in sco_sock_connect(),
as this patch does, makes the order consistent with sco_sock_getsockopt()
and also avoids the possible deadlock.
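
For illustration only, here is a minimal userspace sketch of why a
consistent order removes the deadlock. The pthread mutexes merely stand
in for lock_sock() and hci_dev_lock(); the names sock_lock, hdev_lock,
connect_path(), getsockopt_path() and run_consistent_order() are made up
for this example and are not kernel code.

```c
/* Userspace sketch only: pthread mutexes stand in for the kernel locks. */
#include <assert.h>
#include <pthread.h>

#define ITERATIONS 100000

static pthread_mutex_t sock_lock = PTHREAD_MUTEX_INITIALIZER; /* models lock_sock(sk) */
static pthread_mutex_t hdev_lock = PTHREAD_MUTEX_INITIALIZER; /* models hci_dev_lock(hdev) */
static int shared_counter; /* hypothetical state protected by both locks */

/* Models the patched sco_sock_connect(): socket lock first, then hdev lock. */
static void *connect_path(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&sock_lock);
		pthread_mutex_lock(&hdev_lock);
		shared_counter++;
		pthread_mutex_unlock(&hdev_lock);
		pthread_mutex_unlock(&sock_lock);
	}
	return NULL;
}

/* Models the BT_CODEC case of sco_sock_getsockopt(): same order as above. */
static void *getsockopt_path(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&sock_lock);
		pthread_mutex_lock(&hdev_lock);
		shared_counter++;
		pthread_mutex_unlock(&hdev_lock);
		pthread_mutex_unlock(&sock_lock);
	}
	return NULL;
}

/*
 * Runs both paths concurrently. Because both take sock_lock before
 * hdev_lock, neither thread can hold one lock while waiting for the
 * other in the reverse order, so the run always terminates.
 */
int run_consistent_order(void)
{
	pthread_t a, b;
	int rc;

	shared_counter = 0;
	rc = pthread_create(&a, NULL, connect_path, NULL);
	assert(rc == 0);
	rc = pthread_create(&b, NULL, getsockopt_path, NULL);
	assert(rc == 0);
	(void)rc;
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return shared_counter;
}
```

Calling run_consistent_order() from a main() should return
2 * ITERATIONS. If connect_path() took hdev_lock first instead, each
thread could end up holding one lock while waiting forever for the
other, which is the ABBA pattern described above.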

Ying

On Sat, Mar 26, 2022 at 2:50 AM Luiz Augusto von Dentz
<luiz.dentz@xxxxxxxxx> wrote:
>
> Hi Ying,
>
> On Thu, Mar 24, 2022 at 8:31 PM Ying Hsu <yinghsu@xxxxxxxxxxxx> wrote:
> >
> > Connecting the same socket twice consecutively in sco_sock_connect()
> > could lead to a race condition where two sco_conn objects are created
> > but only one is associated with the socket. If the socket is closed
> > before the SCO connection is established, the timer associated with the
> > dangling sco_conn object won't be canceled. As the sock object is being
> > freed, the use-after-free problem happens when the timer callback
> > function sco_sock_timeout() accesses the socket. Here's the call trace:
> >
> > dump_stack+0x107/0x163
> > ? refcount_inc+0x1c/
> > print_address_description.constprop.0+0x1c/0x47e
> > ? refcount_inc+0x1c/0x7b
> > kasan_report+0x13a/0x173
> > ? refcount_inc+0x1c/0x7b
> > check_memory_region+0x132/0x139
> > refcount_inc+0x1c/0x7b
> > sco_sock_timeout+0xb2/0x1ba
> > process_one_work+0x739/0xbd1
> > ? cancel_delayed_work+0x13f/0x13f
> > ? __raw_spin_lock_init+0xf0/0xf0
> > ? to_kthread+0x59/0x85
> > worker_thread+0x593/0x70e
> > kthread+0x346/0x35a
> > ? drain_workqueue+0x31a/0x31a
> > ? kthread_bind+0x4b/0x4b
> > ret_from_fork+0x1f/0x30
> >
> > Signed-off-by: Ying Hsu <yinghsu@xxxxxxxxxxxx>
> > Reviewed-by: Joseph Hwang <josephsih@xxxxxxxxxxxx>
> > ---
> > Tested this commit using a C reproducer on qemu-x86_64 for 8 hours.
>
> We should probably add a link or something to the reproducer then. Was
> it syzbot? It does have some instructions on how to link its issues.
>
> >  net/bluetooth/sco.c | 21 +++++++++++++--------
> >  1 file changed, 13 insertions(+), 8 deletions(-)
> >
> > diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c
> > index 8eabf41b2993..380c63194736 100644
> > --- a/net/bluetooth/sco.c
> > +++ b/net/bluetooth/sco.c
> > @@ -574,19 +574,24 @@ static int sco_sock_connect(struct socket *sock, struct sockaddr *addr, int alen
> >             addr->sa_family != AF_BLUETOOTH)
> >                 return -EINVAL;
> >
> > -       if (sk->sk_state != BT_OPEN && sk->sk_state != BT_BOUND)
> > -               return -EBADFD;
> > +       lock_sock(sk);
> > +       if (sk->sk_state != BT_OPEN && sk->sk_state != BT_BOUND) {
> > +               err = -EBADFD;
> > +               goto done;
> > +       }
> >
> > -       if (sk->sk_type != SOCK_SEQPACKET)
> > -               return -EINVAL;
> > +       if (sk->sk_type != SOCK_SEQPACKET) {
> > +               err = -EINVAL;
> > +               goto done;
> > +       }
> >
> >         hdev = hci_get_route(&sa->sco_bdaddr, &sco_pi(sk)->src, BDADDR_BREDR);
> > -       if (!hdev)
> > -               return -EHOSTUNREACH;
> > +       if (!hdev) {
> > +               err = -EHOSTUNREACH;
> > +               goto done;
> > +       }
> >         hci_dev_lock(hdev);
> >
> > -       lock_sock(sk);
> > -
>
> Also, are we sure we are not introducing a locking hierarchy problem
> here? Previously we took hci_dev_lock and then the sock lock; now it is
> the opposite. Or perhaps we never want to hold them at the same time?
>
> >         /* Set destination address and psm */
> >         bacpy(&sco_pi(sk)->dst, &sa->sco_bdaddr);
> >
> > --
> > 2.35.1.1021.g381101b075-goog
> >
>
>
> --
> Luiz Augusto von Dentz