Re: [PATCH v1 1/2] Bluetooth: btintel_pcie: Add handshake between driver and firmware

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Oct 01, 2024 at 04:14:50PM +0530, Kiran K wrote:
> The following handshake mechanism needs be followed after firmware
> download is completed to bring the firmware to running state.
> 
> After firmware fragments of Operational image are downloaded and
> secure sends result of the image succeeds,
> 
> 1. Driver sends HCI Intel reset with boot option #1 to switch FW image.
> 2. FW sends Alive GP[0] MSIx
> 3. Driver enables data path (doorbell 0x460 for RBDs, etc...)
> 4. Driver gets Bootup event from firmware
> 5. Driver performs D0 entry to device (WRITE to IPC_Sleep_Control =0x0)
> 6. FW sends Alive GP[0] MSIx
> 7. Device host interface is fully set for BT protocol stack operation.
> 8. Driver may optionally get debug event with ID 0x97 which can be dropped
> 
> For Intermediate loadger image, all the above steps are applicable
> expcept #5 and #6.

I see this is already applied to bluetooth-next and is probably
immutable, but if not...

s/loadger/loader/
s/expcept/except/

But more importantly, the race below is still in linux-next as of
next-20241111.  I mentioned this before at
https://lore.kernel.org/all/20241023191352.GA924610@bhelgaas/#t, but
it probably got lost in the suspend/resume conversation.

> @@ -1053,8 +1272,11 @@ static int btintel_pcie_send_frame(struct hci_dev *hdev,
>  				       struct sk_buff *skb)
>  {
>  	struct btintel_pcie_data *data = hci_get_drvdata(hdev);
> +	struct hci_command_hdr *cmd;
> +	__u16 opcode = ~0;
>  	int ret;
>  	u32 type;
> +	u32 old_ctxt;
>  
>  	/* Due to the fw limitation, the type header of the packet should be
>  	 * 4 bytes unlike 1 byte for UART. In UART, the firmware can read
> @@ -1073,6 +1295,8 @@ static int btintel_pcie_send_frame(struct hci_dev *hdev,
>  	switch (hci_skb_pkt_type(skb)) {
>  	case HCI_COMMAND_PKT:
>  		type = BTINTEL_PCIE_HCI_CMD_PKT;
> +		cmd = (void *)skb->data;
> +		opcode = le16_to_cpu(cmd->opcode);
>  		if (btintel_test_flag(hdev, INTEL_BOOTLOADER)) {
>  			struct hci_command_hdr *cmd = (void *)skb->data;
>  			__u16 opcode = le16_to_cpu(cmd->opcode);
> @@ -1111,6 +1335,30 @@ static int btintel_pcie_send_frame(struct hci_dev *hdev,
>  		bt_dev_err(hdev, "Failed to send frame (%d)", ret);
>  		goto exit_error;
>  	}
> +
> +	if (type == BTINTEL_PCIE_HCI_CMD_PKT &&
> +	    (opcode == HCI_OP_RESET || opcode == 0xfc01)) {
> +		old_ctxt = data->alive_intr_ctxt;
> +		data->alive_intr_ctxt =
> +			(opcode == 0xfc01 ? BTINTEL_PCIE_INTEL_HCI_RESET1 :
> +				BTINTEL_PCIE_HCI_RESET);
> +		bt_dev_dbg(data->hdev, "sent cmd: 0x%4.4x alive context changed: %s  ->  %s",
> +			   opcode, btintel_pcie_alivectxt_state2str(old_ctxt),
> +			   btintel_pcie_alivectxt_state2str(data->alive_intr_ctxt));
> +		if (opcode == HCI_OP_RESET) {
> +			data->gp0_received = false;
> +			ret = wait_event_timeout(data->gp0_wait_q,
> +						 data->gp0_received,
> +						 msecs_to_jiffies(BTINTEL_DEFAULT_INTR_TIMEOUT_MS));

This looks like a race.  You're apparently started something
previously that will eventually result in data->gp0_received being
set.  The ordering you expect is this:

  1) Tell device to do something and interrupt on completion
  2) Set data->gp0_received = false here
  3) Call wait_event_timeout() here
  4) Completion interrupt causes btintel_pcie_msix_gp0_handler() to be
     called
  5) btintel_pcie_msix_gp0_handler() sets data->gp0_received = true
  6) wait_event_timeout() completes

But this ordering can also happen:

  1) Tell device to do something and interrupt on completion
  2) Completion interrupt causes
     btintel_pcie_msix_gp0_handler() to be called
  3) btintel_pcie_msix_gp0_handler() sets data->gp0_received = true
  4) Set data->gp0_received = false here, overwriting the "true"
  5) Call wait_event_timeout() here
  6) wait_event_timeout() times out because completion interrupt has
     already happened
  7) We complain "No alive interrupt received" here

You should be able to force this failure by adding a sleep before
setting data->gp0_received = false in this patch.

> +			if (!ret) {
> +				hdev->stat.err_tx++;
> +				bt_dev_err(hdev, "No alive interrupt received for %s",
> +					   btintel_pcie_alivectxt_state2str(data->alive_intr_ctxt));
> +				ret = -ETIME;
> +				goto exit_error;
> +			}
> +		}
> +	}
>  	hdev->stat.byte_tx += skb->len;
>  	kfree_skb(skb);




[Index of Archives]     [Bluez Devel]     [Linux Wireless Networking]     [Linux Wireless Personal Area Networking]     [Linux ATH6KL]     [Linux USB Devel]     [Linux Media Drivers]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Big List of Linux Books]

  Powered by Linux