Hi, On Mon, Dec 16, 2024 at 8:36 AM Jiayang Mao <quic_jiaymao@xxxxxxxxxxx> wrote: > > Hi Luiz, > > On 2024/12/4 21:47, Jiayang Mao wrote: > > Hi Luiz, > > > > On 2024/12/4 1:28, Luiz Augusto von Dentz wrote: > >> Hi Jiayang, > >> > >> On Tue, Dec 3, 2024 at 12:19 PM Jiayang Mao <quic_jiaymao@xxxxxxxxxxx> > >> wrote: > >>> > >>> Hi Luiz, > >>> > >>> On 2024/12/3 4:41, Luiz Augusto von Dentz wrote: > >>>> Hi Jiayang, > >>>> > >>>> On Mon, Nov 25, 2024 at 12:51 PM Jiayang Mao > >>>> <quic_jiaymao@xxxxxxxxxxx> wrote: > >>>>> > >>>>> Clear the remaining command in cmd_sync_work_list when BT is > >>>>> performing power off. In some cases, this list is not empty after > >>>>> power off. BT host will try to send more HCI commands. > >>>>> This can cause unexpected results. > >>>> > >>>> What commands are in the queue? > >>> > >>> If turning off BT during pairing, "hci_acl_create_conn_sync" has chances > >>> to be left in the queue. Then the driver will try to send the HCI > >>> command of creating connection but failed. > >> > >> There shouldn't be happening though: > >> > >> /* Terminated due to Power Off */ > >> err = hci_disconnect_all_sync(hdev, HCI_ERROR_REMOTE_POWER_OFF); > >> if (err) > >> goto out; > >> > >> err = hci_dev_close_sync(hdev); > >> > >> Perhaps there is something attempting to connect after > >> hci_disconnect_all_sync has completed, in that case there is a bug > >> around this sequence or we need to check HCI_POWERING_DOWN to not > >> attempt to process the connection attempts. > >> > > After pairing, an L2CAP channel is to be created by l2cap_sock_create > > function. It eventually calls hci_connect_acl_sync, which adds > > hci_acl_create_conn_sync to the cmd_sync_work_list. > > > > The issue arises if BT is turned off after this addition but before > > hci_acl_create_conn_sync is execute. > > > > Your suggestion to check HCI_POWERING_DOWN seems more appropriate > > for addressing this issue. We can try incorporating this check into > > hci_acl_create_conn_sync. > > Thank you for your attention to this matter. Could you please help to > check my reply? Please let me know if there are any other concerns, or > if I should submit another change to check HCI_POWERING_DOWN in > hci_acl_create_conn_sync. Yes, please add a check for HCI_POWERING_DOWN, also if you could come with a test for mgmt-tester to test such a scenario it would be great since I'm not quite sure how you are able to reproduce this, or is this a test script that does bluetootctl> connect + power off? > > > >>>> > >>>>> Signed-off-by: Jiayang Mao <quic_jiaymao@xxxxxxxxxxx> > >>>>> --- > >>>>> net/bluetooth/hci_sync.c | 6 ++++++ > >>>>> 1 file changed, 6 insertions(+) > >>>>> > >>>>> diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c > >>>>> index c86f4e42e..bc622d074 100644 > >>>>> --- a/net/bluetooth/hci_sync.c > >>>>> +++ b/net/bluetooth/hci_sync.c > >>>>> @@ -5139,6 +5139,7 @@ int hci_dev_close_sync(struct hci_dev *hdev) > >>>>> { > >>>>> bool auto_off; > >>>>> int err = 0; > >>>>> + struct hci_cmd_sync_work_entry *entry, *tmp; > >>>>> > >>>>> bt_dev_dbg(hdev, ""); > >>>>> > >>>>> @@ -5258,6 +5259,11 @@ int hci_dev_close_sync(struct hci_dev *hdev) > >>>>> clear_bit(HCI_RUNNING, &hdev->flags); > >>>>> hci_sock_dev_event(hdev, HCI_DEV_CLOSE); > >>>>> > >>>>> + mutex_lock(&hdev->cmd_sync_work_lock); > >>>>> + list_for_each_entry_safe(entry, tmp, &hdev- > >>>>> >cmd_sync_work_list, list) > >>>>> + _hci_cmd_sync_cancel_entry(hdev, entry, -ECANCELED); > >>>>> + mutex_unlock(&hdev->cmd_sync_work_lock); > >>>> > >>>> Seems equivalent to hci_cmd_sync_clear, that said we should have been > >>>> running with that lock already, also if there is a sequence like > >>>> close/open the close may cancel the subsequent open, so I don't think > >>>> we should be canceling every subsequent callback like this. > >>> > >>> In hci_cmd_sync_clear, the work cmd_sync_work and reenable_adv_work are > >>> canceled. hci_cmd_sync_clear is not directly called because these two > >>> works should not be canceled during power off. > >>> Do you mean the added code should be moved to other functions to avoid > >>> the risk of lock? > >>> > >>> Yes. This change lacks considering sequence of close/open. I will update > >>> the implementation to ensure it does not remove the opening and the > >>> operations after re-opening. > >>>> > >>>>> /* After this point our queues are empty and no tasks are > >>>>> scheduled. */ > >>>>> hdev->close(hdev); > >>>>> > >>>>> -- > >>>>> 2.25.1 > >>>>> > >>>> > >>>> > >>> > >> > >> > > > Thanks, > Jiayang Mao -- Luiz Augusto von Dentz