Re: [PATCH v1] Bluetooth: Fix race condition in handling NOP command

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Manish,

On Thu, Aug 12, 2021 at 3:58 AM K, Kiran <kiran.k@xxxxxxxxx> wrote:
>
> Hi Marcel,
>
> > -----Original Message-----
> > From: K, Kiran
> > Sent: Friday, August 6, 2021 8:14 PM
> > To: 'Marcel Holtmann' <marcel@xxxxxxxxxxxx>
> > Cc: BlueZ <linux-bluetooth@xxxxxxxxxxxxxxx>; Srivatsa, Ravishankar
> > <ravishankar.srivatsa@xxxxxxxxx>; Tumkur Narayan, Chethan
> > <chethan.tumkur.narayan@xxxxxxxxx>
> > Subject: RE: [PATCH v1] Bluetooth: Fix race condition in handling NOP
> > command
> >
> > Hi Marcel,
> >
> > > -----Original Message-----
> > > From: Marcel Holtmann <marcel@xxxxxxxxxxxx>
> > > Sent: Thursday, August 5, 2021 6:41 PM
> > > To: K, Kiran <kiran.k@xxxxxxxxx>
> > > Cc: BlueZ <linux-bluetooth@xxxxxxxxxxxxxxx>; Srivatsa, Ravishankar
> > > <ravishankar.srivatsa@xxxxxxxxx>; Tumkur Narayan, Chethan
> > > <chethan.tumkur.narayan@xxxxxxxxx>
> > > Subject: Re: [PATCH v1] Bluetooth: Fix race condition in handling NOP
> > > command
> > >
> > > Hi Kiran,
> > >
> > > > For NOP command, need to cancel work scheduled on cmd_timer, on
> > > > receiving command status or commmand complete event.
> > > >
> > > > Below use case might lead to race condition multiple when NOP
> > > > commands are queued sequentially:
> > > >
> > > > hci_cmd_work() {
> > > >   if (atomic_read(&hdev->cmd_cnt) {
> > > >            .
> > > >            .
> > > >            .
> > > >      atomic_dec(&hdev->cmd_cnt);
> > > >      hci_send_frame(hdev,...);
> > > >      schedule_delayed_work(&hdev->cmd_timer,...);
> > > >   }
> > > > }
> > > >
> > > > On receiving event for first NOP, the work scheduled on
> > > > hdev->cmd_timer is not cancelled and  second NOP is dequeued and
> > > > hdev->sent
> > > to controller.
> > > >
> > > > While waiting for an event for second NOP command, work scheduled on
> > > > cmd_timer for first NOP can get scheduled, resulting in sending
> > > > third NOP command not waiting for an event for second NOP. This
> > > > might cause issues at controller side (like memory overrun,
> > > > controller going
> > > > unresponsive) resulting in hci tx timeouts, hardware errors etc.
> > > >
> > > > Signed-off-by: Kiran K <kiran.k@xxxxxxxxx>
> > > > Reviewed-by: Chethan T N <chethan.tumkur.narayan@xxxxxxxxx>
> > > > Reviewed-by: Srivatsa Ravishankar <ravishankar.srivatsa@xxxxxxxxx>
> > > > ---
> > > > net/bluetooth/hci_event.c | 3 +--
> > > > 1 file changed, 1 insertion(+), 2 deletions(-)
> > > >
> > > > diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
> > > > index ea7fc09478be..14dfbdc8b81b 100644
> > > > --- a/net/bluetooth/hci_event.c
> > > > +++ b/net/bluetooth/hci_event.c
> > > > @@ -3271,8 +3271,7 @@ static void hci_remote_features_evt(struct
> > > > hci_dev *hdev, static inline void handle_cmd_cnt_and_timer(struct
> > > > hci_dev
> > > *hdev,
> > > >                                       u16 opcode, u8 ncmd)
> > > > {
> > > > - if (opcode != HCI_OP_NOP)
> > > > -         cancel_delayed_work(&hdev->cmd_timer);
> > > > + cancel_delayed_work(&hdev->cmd_timer);
> > > >
> > > >   if (!test_bit(HCI_RESET, &hdev->flags)) {
> > > >           if (ncmd) {
> > >
> > > so this is conflicting with the patch introducing the ncmd timeout handling.
> > >
> > My patch specifically addresses the issue observed in case of NOP command.
> > It prevents the issue by handling NOP same as any other SIG command.
> >
> > It looks commit de75cd0d9b2f3250d5f25846bb5632ccce6275f4 tries to
> > recover when controller goes bad.
> >
>
> Do you have any further comments here ? Waiting for your input.
>
> > > commit de75cd0d9b2f3250d5f25846bb5632ccce6275f4
> > > Author: Manish Mandlik <mmandlik@xxxxxxxxxx>
> > > Date:   Thu Apr 29 10:24:22 2021 -0700
> > >
> > >     Bluetooth: Add ncmd=0 recovery handling
> > >
> > >     During command status or command complete event, the controller
> > > may set
> > >     ncmd=0 indicating that it is not accepting any more commands. In such a
> > >     case, host holds off sending any more commands to the controller. If the
> > >     controller doesn't recover from such condition, host will wait forever,
> > >     until the user decides that the Bluetooth is broken and may power cycles
> > >     the Bluetooth.
> > >
> > >     This patch triggers the hardware error to reset the controller and
> > >     driver when it gets into such state as there is no other wat out.
> > >
> > > Nowhere in your commit description you are addressing why is this the
> > > right to do.
> > >
> >
> > Will fix it in the next version if you are OK with the current fix. Please let me
> > know.

Can you confirm this change doesn't break your patch above?

> >
> > > Regards
> > >
> > > Marcel
> >
> > Thanks,
> > Kiran
>
> Thanks,
> Kiran
>
>


-- 
Luiz Augusto von Dentz



[Index of Archives]     [Bluez Devel]     [Linux Wireless Networking]     [Linux Wireless Personal Area Networking]     [Linux ATH6KL]     [Linux USB Devel]     [Linux Media Drivers]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Big List of Linux Books]

  Powered by Linux