RE: [PATCH v1] Bluetooth: Fix race condition in handling NOP command

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Marcel,

> -----Original Message-----
> From: Marcel Holtmann <marcel@xxxxxxxxxxxx>
> Sent: Thursday, August 5, 2021 6:41 PM
> To: K, Kiran <kiran.k@xxxxxxxxx>
> Cc: BlueZ <linux-bluetooth@xxxxxxxxxxxxxxx>; Srivatsa, Ravishankar
> <ravishankar.srivatsa@xxxxxxxxx>; Tumkur Narayan, Chethan
> <chethan.tumkur.narayan@xxxxxxxxx>
> Subject: Re: [PATCH v1] Bluetooth: Fix race condition in handling NOP
> command
> 
> Hi Kiran,
> 
> > For NOP command, need to cancel work scheduled on cmd_timer, on
> > receiving command status or commmand complete event.
> >
> > Below use case might lead to race condition multiple when NOP commands
> > are queued sequentially:
> >
> > hci_cmd_work() {
> >   if (atomic_read(&hdev->cmd_cnt) {
> >            .
> >            .
> >            .
> >      atomic_dec(&hdev->cmd_cnt);
> >      hci_send_frame(hdev,...);
> >      schedule_delayed_work(&hdev->cmd_timer,...);
> >   }
> > }
> >
> > On receiving event for first NOP, the work scheduled on
> > hdev->cmd_timer is not cancelled and  second NOP is dequeued and sent
> to controller.
> >
> > While waiting for an event for second NOP command, work scheduled on
> > cmd_timer for first NOP can get scheduled, resulting in sending third
> > NOP command not waiting for an event for second NOP. This might cause
> > issues at controller side (like memory overrun, controller going
> > unresponsive) resulting in hci tx timeouts, hardware errors etc.
> >
> > Signed-off-by: Kiran K <kiran.k@xxxxxxxxx>
> > Reviewed-by: Chethan T N <chethan.tumkur.narayan@xxxxxxxxx>
> > Reviewed-by: Srivatsa Ravishankar <ravishankar.srivatsa@xxxxxxxxx>
> > ---
> > net/bluetooth/hci_event.c | 3 +--
> > 1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
> > index ea7fc09478be..14dfbdc8b81b 100644
> > --- a/net/bluetooth/hci_event.c
> > +++ b/net/bluetooth/hci_event.c
> > @@ -3271,8 +3271,7 @@ static void hci_remote_features_evt(struct
> > hci_dev *hdev, static inline void handle_cmd_cnt_and_timer(struct hci_dev
> *hdev,
> > 					    u16 opcode, u8 ncmd)
> > {
> > -	if (opcode != HCI_OP_NOP)
> > -		cancel_delayed_work(&hdev->cmd_timer);
> > +	cancel_delayed_work(&hdev->cmd_timer);
> >
> > 	if (!test_bit(HCI_RESET, &hdev->flags)) {
> > 		if (ncmd) {
> 
> so this is conflicting with the patch introducing the ncmd timeout handling.
> 
My patch specifically addresses the issue observed in case of NOP command. It prevents the issue by handling NOP same as any other SIG command.

It looks commit de75cd0d9b2f3250d5f25846bb5632ccce6275f4 tries to recover when controller goes bad.
  
> commit de75cd0d9b2f3250d5f25846bb5632ccce6275f4
> Author: Manish Mandlik <mmandlik@xxxxxxxxxx>
> Date:   Thu Apr 29 10:24:22 2021 -0700
> 
>     Bluetooth: Add ncmd=0 recovery handling
> 
>     During command status or command complete event, the controller may
> set
>     ncmd=0 indicating that it is not accepting any more commands. In such a
>     case, host holds off sending any more commands to the controller. If the
>     controller doesn't recover from such condition, host will wait forever,
>     until the user decides that the Bluetooth is broken and may power cycles
>     the Bluetooth.
> 
>     This patch triggers the hardware error to reset the controller and
>     driver when it gets into such state as there is no other wat out.
> 
> Nowhere in your commit description you are addressing why is this the right
> to do.
>

Will fix it in the next version if you are OK with the current fix. Please let me know.

> Regards
> 
> Marcel

Thanks,
Kiran





[Index of Archives]     [Bluez Devel]     [Linux Wireless Networking]     [Linux Wireless Personal Area Networking]     [Linux ATH6KL]     [Linux USB Devel]     [Linux Media Drivers]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Big List of Linux Books]

  Powered by Linux