Re: [PATCH v1] Bluetooth: hci_sync: Fix disconnect complete event timeout issue

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Cheng,

On Mon, Dec 16, 2024 at 9:32 AM Luiz Augusto von Dentz
<luiz.dentz@xxxxxxxxx> wrote:
>
> Hi Cheng,
>
> On Mon, Dec 16, 2024 at 3:08 AM Cheng Jiang <quic_chejiang@xxxxxxxxxxx> wrote:
> >
> > Sometimes, the remote device doesn't acknowledge the LL_TERMINATE_IND
> > in time, requiring the controller to wait for the supervision timeout,
> > which may exceed 2 seconds. In the current implementation, the
> > HCI_EV_DISCONN_COMPLETE event is ignored if it arrives late, since
> > the hci_abort_conn_sync has cleaned up the connection after 2 seconds.
> > This causes the mgmt to get stuck, resulting in bluetoothd waiting
> > indefinitely for the mgmt response to the disconnect. To recover,
> > restarting bluetoothd is necessary.
> >
> > bluetoothctl log like this:
> > [Designer Mouse]# disconnect D9:B5:6C:F2:51:91
> > Attempting to disconnect from D9:B5:6C:F2:51:91
> > [Designer Mouse]#
> > [Designer Mouse]# power off
> > [Designer Mouse]#
> > Failed to set power off: org.freedesktop.DBus.Error.NoReply.
> >
> > Signed-off-by: Cheng Jiang <quic_chejiang@xxxxxxxxxxx>
> > ---
> >  include/net/bluetooth/hci_core.h |  2 ++
> >  net/bluetooth/hci_conn.c         |  9 +++++++++
> >  net/bluetooth/hci_event.c        |  9 +++++++++
> >  net/bluetooth/hci_sync.c         | 18 ++++++++++++++++++
> >  4 files changed, 38 insertions(+)
> >
> > diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h
> > index 734cd50cd..2ab079dcf 100644
> > --- a/include/net/bluetooth/hci_core.h
> > +++ b/include/net/bluetooth/hci_core.h
> > @@ -753,6 +753,8 @@ struct hci_conn {
> >
> >         struct bt_codec codec;
> >
> > +       struct completion disc_ev_comp;
> > +
> >         void (*connect_cfm_cb)  (struct hci_conn *conn, u8 status);
> >         void (*security_cfm_cb) (struct hci_conn *conn, u8 status);
> >         void (*disconn_cfm_cb)  (struct hci_conn *conn, u8 reason);
> > diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c
> > index d097e308a..e0244e191 100644
> > --- a/net/bluetooth/hci_conn.c
> > +++ b/net/bluetooth/hci_conn.c
> > @@ -1028,6 +1028,15 @@ static struct hci_conn *__hci_conn_add(struct hci_dev *hdev, int type, bdaddr_t
> >
> >         hci_conn_init_sysfs(conn);
> >
> > +       /* This disc_ev_comp is inited when we send a disconnect request to
> > +        * the remote device but fail to receive the disconnect complete
> > +        * event within the expected time (2 seconds). This occurs because
> > +        * the remote device doesn't ack the terminate indication, forcing
> > +        * the controller to wait for the supervision timeout.
> > +        */
> > +       init_completion(&conn->disc_ev_comp);
> > +       complete(&conn->disc_ev_comp);
> > +
> >         return conn;
> >  }
> >
> > diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
> > index 2cc7a9306..60ecb2b18 100644
> > --- a/net/bluetooth/hci_event.c
> > +++ b/net/bluetooth/hci_event.c
> > @@ -3366,6 +3366,15 @@ static void hci_disconn_complete_evt(struct hci_dev *hdev, void *data,
> >         if (!conn)
> >                 goto unlock;
> >
> > +       /* Wake up disc_ev_comp here is ok. Since we hold the hdev lock
> > +        * hci_abort_conn_sync will wait hdev lock release to continue.
> > +        */
> > +       if (!completion_done(&conn->disc_ev_comp)) {
> > +               complete(&conn->disc_ev_comp);
> > +               /* Add some delay for hci_abort_conn_sync to handle the complete */
> > +               usleep_range(100, 1000);
> > +       }
> > +
> >         if (ev->status) {
> >                 mgmt_disconnect_failed(hdev, &conn->dst, conn->type,
> >                                        conn->dst_type, ev->status);
> > diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c
> > index 0badec712..783d04b57 100644
> > --- a/net/bluetooth/hci_sync.c
> > +++ b/net/bluetooth/hci_sync.c
> > @@ -5590,6 +5590,24 @@ int hci_abort_conn_sync(struct hci_dev *hdev, struct hci_conn *conn, u8 reason)
> >                 break;
> >         }
> >
> > +       /* Check whether the connection is successfully disconnected.
> > +        * Sometimes the remote device doesn't acknowledge the
> > +        * LL_TERMINATE_IND in time, requiring the controller to wait
> > +        * for the supervision timeout, which may exceed 2 seconds. In
> > +        * this case, we need to wait for the HCI_EV_DISCONN_COMPLETE
> > +        * event before cleaning up the connection.
> > +        */
> > +       if (err == -ETIMEDOUT) {
> > +               u32 idle_delay = msecs_to_jiffies(10 * conn->le_supv_timeout);
> > +
> > +               reinit_completion(&conn->disc_ev_comp);
> > +               if (!wait_for_completion_timeout(&conn->disc_ev_comp, idle_delay)) {
> > +                       bt_dev_warn(hdev, "Failed to get complete");
> > +                       mgmt_disconnect_failed(hdev, &conn->dst, conn->type,
> > +                                              conn->dst_type, conn->abort_reason);
> > +               }
> > +       }
>
> Why don't we just set the supervision timeout as timeout then? If we
> will have to wait for it anyway just change hci_disconnect_sync to use
> 10 * conn->le_supv_timeout as timeout instead.
>
> That said, we really need to fix bluetoothd if it is not able to be
> cleaned up if SET_POWERED command fails, but it looks like it is
> handling errors correctly so it sounds like something else is at play.

I double checked this and apparently this could no longer fail:

+               /* Disregard possible errors since hci_conn_del shall have been
+                * called even in case of errors had occurred since it would
+                * then cause hci_conn_failed to be called which calls
+                * hci_conn_del internally.
+                */
+               hci_abort_conn_sync(hdev, conn, reason);

So it will clean up the hci_conn no matter what is the timeout, so
either you don't have this change or it is not working for some
reason.

> >         hci_dev_lock(hdev);
> >
> >         /* Check if the connection has been cleaned up concurrently */
> >
> > base-commit: e25c8d66f6786300b680866c0e0139981273feba
> > --
> > 2.34.1
> >
>
>
> --
> Luiz Augusto von Dentz



-- 
Luiz Augusto von Dentz





[Index of Archives]     [Bluez Devel]     [Linux Wireless Networking]     [Linux Wireless Personal Area Networking]     [Linux ATH6KL]     [Linux USB Devel]     [Linux Media Drivers]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Big List of Linux Books]

  Powered by Linux