Re: regression in c7f59461f5a78: Bluetooth: Fix a refcnt underflow problem for hci_conn

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Andrei,

On Mon, Jan 22, 2024 at 10:50 AM Andrei Volkov
<andrey.volkov@xxxxxxxxxxxxxxxxx> wrote:
>
> Hi Luiz,
>
> Le 22/01/2024 à 16:10, Luiz Augusto von Dentz a écrit :
>
> > Hi Andrei,
> >
> > On Mon, Jan 22, 2024 at 9:45 AM Andrei Volkov
> > <andrey.volkov@xxxxxxxxxxxxxxxxx> wrote:
> >> Hi Luiz,
> >>
> >> Wouldn't it be better to add a 'yet-another' flag as an addition to
> >> HCI_CONN_SSP_ENABLED, and clear it unconditionally at the top of
> >> 'hci_remote_ext_features_evt', before
> >>
> >> "        conn = hci_conn_hash_lookup_handle"
> >>
> >> check?
> >>
> >> In this case broken ext_features response (with ev->status != 0 or conn
> >> == NULL) will not indirectly enable the SSP feature.
> > HCI_CONN_SSP_ENABLED is already at conn level, besides that it is too
> > late to clear it if the SSP procedure has already taken place and on
> > top of it I don't want to change the code too much and risk having
> > more regressions like this.
> >
> > Btw, something tells me that Android is not actually doing the
> > HCI_OP_READ_REMOTE_EXT_FEATURES since we do have CI testing SSP and
> > this change hasn't cause us any problems, do you know what command it
> > uses? Perhaps it tries SSP right-away or does it cache the previous
> > response?
> >
> The real problem is that since the ext. device does not receive a
> response to the IO request (cmd 0x31) from BlueZ, it refuses to continue
> the pairing. But with proposed reverting the ssp check, we are probably
> coming back to the initial href problem.
>
> Btw, for me it's unclear how the additional check in
> "hci_io_capa_request_evt" helps with href undderrun. It looks like the
> original FSM implementation is missing something, and the fix actually
> is masking the problem.

That is a totally different problem, most likely related to
conn->disc_work not being cancelled properly or something, so this
probably was only masking it for some reason but it is not a proper
fix, anyway it is still a good idea to check if hdev has SSP enabled
in any case, since it can be enabled/disabled at runtime thus why I'm
not completely reverting it.

If the conn_timeout problem comes back then we need to investigate
what is really causing that, but there is a high chance this has been
fixed since we have been reworking some of the code paths related to
hci_conn_del, etc.

> Regards
> Andrei VOLKOV
>
> >> Regards
> >> Andrei VOLKOV
> >>
> >> Le 22/01/2024 à 15:02, Luiz Augusto von Dentz a écrit :
> >>
> >>> Hi Andrei,
> >>>
> >>> On Mon, Jan 22, 2024 at 7:18 AM Andrei Volkov
> >>> <andrey.volkov@xxxxxxxxxxxxxxxxx> wrote:
> >>>> Hello,
> >>>>
> >>>> Lately we've bumped with regression introduced by commit:
> >>>>
> >>>>     c7f59461f5a78 ("Bluetooth: Fix a refcnt underflow problem for
> >>>> hci_conn", 2023-10-04)
> >>>>
> >>>> The regression related with adding "hci_conn_ssp_enabled()" check in
> >>>> "hci_io_capa_request_evt()" handler, and broke pairing process initiated
> >>>> by the external device.
> >>>>
> >>>> Precisely, some ext. devices, like any phone equipped with Android ver <
> >>>> 14 (we have not latest one, so we didn't check), always send "IO
> >>>> Capability Request" before "Read Remote Extended Features" command, as
> >>>> consequence the flag "HCI_CONN_SSP_ENABLED" not yet activated at the
> >>>> time of "hci_io_capa_request_evt()" execution  and
> >>>> "hci_conn_ssp_enabled()" always returns false in time of the pairing.
> >>>>
> >>>> As a result, pairing always fails. The quick and dirty fix is revert the
> >>>> ssp check introduced in the subj. commit, like below:
> >>>>
> >>>> --- a/net/bluetooth/hci_event.c
> >>>> +++ b/net/bluetooth/hci_event.c
> >>>> @@ -5329,7 +5329,7 @@ static void hci_io_capa_request_evt(struct hci_dev
> >>>> *hdev, void *data,
> >>>>            hci_dev_lock(hdev);
> >>>>
> >>>>            conn = hci_conn_hash_lookup_ba(hdev, ACL_LINK, &ev->bdaddr);
> >>>> -       if (!conn || !hci_conn_ssp_enabled(conn))
> >>>> +       if (!conn)
> >>>>                    goto unlock;
> >>>>
> >>>>            hci_conn_hold(conn);
> >>>>
> >>>>
> >>>> However, a more thorough and correct fix requires discussion and
> >>>> testing. Therefore, I would like to get any comments/suggestion from the
> >>>> community before doing this.
> >>> I think we need to do something like this, so we consider only the
> >>> local SSP flag when evaluating if we should proceed to respond or just
> >>> drop:
> >>>
> >>> diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
> >>> index 6130c969f361..a15924db83d9 100644
> >>> --- a/net/bluetooth/hci_event.c
> >>> +++ b/net/bluetooth/hci_event.c
> >>> @@ -5327,9 +5327,12 @@ static void hci_io_capa_request_evt(struct
> >>> hci_dev *hdev, void *data,
> >>>           hci_dev_lock(hdev);
> >>>
> >>>           conn = hci_conn_hash_lookup_ba(hdev, ACL_LINK, &ev->bdaddr);
> >>> -       if (!conn || !hci_conn_ssp_enabled(conn))
> >>> +       if (!conn || !hci_dev_test_flag(hdev, HCI_SSP_ENABLED))
> >>>                   goto unlock;
> >>>
> >>> +       /* Assume remote supports SSP since it has triggered this event */
> >>> +       set_bit(HCI_CONN_SSP_ENABLED, &conn->flags);
> >>> +
> >>>           hci_conn_hold(conn);
> >>>
> >>>           if (!hci_dev_test_flag(hdev, HCI_MGMT))
> >>>
> >>>
> >>>> Regards
> >>>> Andrey VOLKOV
> >>>>
> >>>>
> >
> >



-- 
Luiz Augusto von Dentz





[Index of Archives]     [Bluez Devel]     [Linux Wireless Networking]     [Linux Wireless Personal Area Networking]     [Linux ATH6KL]     [Linux USB Devel]     [Linux Media Drivers]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Big List of Linux Books]

  Powered by Linux