RE: [PATCH v1 1/2] Bluetooth: btintel_pcie: Add handshake between driver and firmware

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Bjorn,

>-----Original Message-----
>From: Bjorn Helgaas <helgaas@xxxxxxxxxx>
>Sent: Monday, November 11, 2024 11:59 PM
>To: K, Kiran <kiran.k@xxxxxxxxx>
>Cc: linux-bluetooth@xxxxxxxxxxxxxxx; Srivatsa, Ravishankar
><ravishankar.srivatsa@xxxxxxxxx>; Tumkur Narayan, Chethan
><chethan.tumkur.narayan@xxxxxxxxx>; Devegowda, Chandrashekar
><chandrashekar.devegowda@xxxxxxxxx>
>Subject: Re: [PATCH v1 1/2] Bluetooth: btintel_pcie: Add handshake between
>driver and firmware
>
>On Tue, Oct 01, 2024 at 04:14:50PM +0530, Kiran K wrote:
>> The following handshake mechanism needs be followed after firmware
>> download is completed to bring the firmware to running state.
>>
>> After firmware fragments of Operational image are downloaded and
>> secure sends result of the image succeeds,
>>
>> 1. Driver sends HCI Intel reset with boot option #1 to switch FW image.
>> 2. FW sends Alive GP[0] MSIx
>> 3. Driver enables data path (doorbell 0x460 for RBDs, etc...) 4.
>> Driver gets Bootup event from firmware 5. Driver performs D0 entry to
>> device (WRITE to IPC_Sleep_Control =0x0) 6. FW sends Alive GP[0] MSIx
>> 7. Device host interface is fully set for BT protocol stack operation.
>> 8. Driver may optionally get debug event with ID 0x97 which can be
>> dropped
>>
>> For Intermediate loadger image, all the above steps are applicable
>> expcept #5 and #6.
>
>I see this is already applied to bluetooth-next and is probably immutable, but
>if not...
>
>s/loadger/loader/
>s/expcept/except/
>
>But more importantly, the race below is still in linux-next as of next-20241111.
>I mentioned this before at
>https://lore.kernel.org/all/20241023191352.GA924610@bhelgaas/#t, but it
>probably got lost in the suspend/resume conversation.
>
>> @@ -1053,8 +1272,11 @@ static int btintel_pcie_send_frame(struct hci_dev
>*hdev,
>>  				       struct sk_buff *skb)
>>  {
>>  	struct btintel_pcie_data *data = hci_get_drvdata(hdev);
>> +	struct hci_command_hdr *cmd;
>> +	__u16 opcode = ~0;
>>  	int ret;
>>  	u32 type;
>> +	u32 old_ctxt;
>>
>>  	/* Due to the fw limitation, the type header of the packet should be
>>  	 * 4 bytes unlike 1 byte for UART. In UART, the firmware can read @@
>> -1073,6 +1295,8 @@ static int btintel_pcie_send_frame(struct hci_dev *hdev,
>>  	switch (hci_skb_pkt_type(skb)) {
>>  	case HCI_COMMAND_PKT:
>>  		type = BTINTEL_PCIE_HCI_CMD_PKT;
>> +		cmd = (void *)skb->data;
>> +		opcode = le16_to_cpu(cmd->opcode);
>>  		if (btintel_test_flag(hdev, INTEL_BOOTLOADER)) {
>>  			struct hci_command_hdr *cmd = (void *)skb->data;
>>  			__u16 opcode = le16_to_cpu(cmd->opcode); @@ -
>1111,6 +1335,30 @@
>> static int btintel_pcie_send_frame(struct hci_dev *hdev,
>>  		bt_dev_err(hdev, "Failed to send frame (%d)", ret);
>>  		goto exit_error;
>>  	}
>> +
>> +	if (type == BTINTEL_PCIE_HCI_CMD_PKT &&
>> +	    (opcode == HCI_OP_RESET || opcode == 0xfc01)) {
>> +		old_ctxt = data->alive_intr_ctxt;
>> +		data->alive_intr_ctxt =
>> +			(opcode == 0xfc01 ? BTINTEL_PCIE_INTEL_HCI_RESET1
>:
>> +				BTINTEL_PCIE_HCI_RESET);
>> +		bt_dev_dbg(data->hdev, "sent cmd: 0x%4.4x alive context
>changed: %s  ->  %s",
>> +			   opcode, btintel_pcie_alivectxt_state2str(old_ctxt),
>> +			   btintel_pcie_alivectxt_state2str(data-
>>alive_intr_ctxt));
>> +		if (opcode == HCI_OP_RESET) {
>> +			data->gp0_received = false;
>> +			ret = wait_event_timeout(data->gp0_wait_q,
>> +						 data->gp0_received,
>> +
>msecs_to_jiffies(BTINTEL_DEFAULT_INTR_TIMEOUT_MS));
>
>This looks like a race.  You're apparently started something previously that will
>eventually result in data->gp0_received being set.  The ordering you expect is
>this:
>
>  1) Tell device to do something and interrupt on completion
>  2) Set data->gp0_received = false here
>  3) Call wait_event_timeout() here
>  4) Completion interrupt causes btintel_pcie_msix_gp0_handler() to be
>     called
>  5) btintel_pcie_msix_gp0_handler() sets data->gp0_received = true
>  6) wait_event_timeout() completes
>
>But this ordering can also happen:
>
>  1) Tell device to do something and interrupt on completion
>  2) Completion interrupt causes
>     btintel_pcie_msix_gp0_handler() to be called
>  3) btintel_pcie_msix_gp0_handler() sets data->gp0_received = true
>  4) Set data->gp0_received = false here, overwriting the "true"
>  5) Call wait_event_timeout() here
>  6) wait_event_timeout() times out because completion interrupt has
>     already happened
>  7) We complain "No alive interrupt received" here
>
Possible. Unfortunately, this issue did not show up in our test. I will reproduce this issue and publish the same.
>You should be able to force this failure by adding a sleep before setting data-
>>gp0_received = false in this patch.

Ack.


>
>> +			if (!ret) {
>> +				hdev->stat.err_tx++;
>> +				bt_dev_err(hdev, "No alive interrupt received
>for %s",
>> +					   btintel_pcie_alivectxt_state2str(data-
>>alive_intr_ctxt));
>> +				ret = -ETIME;
>> +				goto exit_error;
>> +			}
>> +		}
>> +	}
>>  	hdev->stat.byte_tx += skb->len;
>>  	kfree_skb(skb);

Thanks,
Kiran






[Index of Archives]     [Bluez Devel]     [Linux Wireless Networking]     [Linux Wireless Personal Area Networking]     [Linux ATH6KL]     [Linux USB Devel]     [Linux Media Drivers]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Big List of Linux Books]

  Powered by Linux