ISO flow control issues with Intel AX210 fw build 79483

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

The behavior of HCI ISO Data Packet flow control on Intel AX210 seems to
have changed between firmware versions 23.20.0.3 (fw build REL75234,
Jan 2024) and 23.40.0.2 (fw build REL79483, Mar 2024).

In practice, with this firmware version TWS playback with two CIS
essentially always loses one of the two streams, as kernel stops sending
further data to the controller for one of the two.

This does not occur with the earlier firmware version.

The new behavior looks as if the controller is failing to send Number of
Completed Packets events for some of the ISO data packets it actually
has processed (which would seem wrong, Core v5.4 Vol 4 Part E §4.1.1),
causing the kernel to stop queueing further data.

btsnoop log when streaming data to two CIS:
https://gitlab.freedesktop.org/pvir/repros/-/raw/main/2024-04-22_1.log?inline=false

***

Looking at `btmon -r 2024-04-22_1.log` output:

Initially it looks fine:
```
< ISO Data TX: Handle 2305 flags 0x02 dlen 121           #1283 [hci1] 17.716197
< ISO Data TX: Handle 2304 flags 0x02 dlen 121           #1284 [hci1] 17.716214
> HCI Event: Number of Completed Packets (0x13) plen 9   #1285 [hci1] 17.716529
        Num handles: 2
        Handle: 2304 Address: 28:3D:C2:4A:7E:DA (Samsung Electronics Co.,Ltd)
        Count: 1
        #1280: len 121 (138 Kb/s)
        Latency: 7 msec (5-8 msec ~7 msec)
        Handle: 2305 Address: 28:3D:C2:4A:7D:2A (Samsung Electronics Co.,Ltd)
        Count: 1
        #1281: len 121 (138 Kb/s)
        Latency: 7 msec (5-8 msec ~7 msec)
```
After physically covering one of the receiver devices briefly, which appears to
temporarily reduce the connection quality:
```
< ISO Data TX: Handle 2305 flags 0x02 dlen 121           #1430 [hci1] 18.083668
< ISO Data TX: Handle 2304 flags 0x02 dlen 121           #1431 [hci1] 18.083684
> HCI Event: Number of Completed Packets (0x13) plen 9   #1432 [hci1] 18.083660
        Num handles: 2
        Handle: 2304 Address: 28:3D:C2:4A:7E:DA (Samsung Electronics Co.,Ltd)
        Count: 1
        #1425: len 121 (69 Kb/s)
        Latency: 14 msec (5-15 msec ~15 msec)
        Handle: 2305 Address: 28:3D:C2:4A:7D:2A (Samsung Electronics Co.,Ltd)
        Count: 1
        #1427: len 121 (138 Kb/s)
        Latency: 7 msec (5-8 msec ~7 msec)
```
The latencies stay constant after this. In this situation when the two
CIS have different latency (as seen in btmon), in the earlier firmware
the CIS playback was desynchronized
(cf https://github.com/bluez/bluez/issues/515), but in this instance
they remain synchronized despite the latency difference in the NCP
events.

Note however that CIS desynchronization also still occurs with this
firmware revision, but we can no longer compensate for it on application
side, as the NCP events no longer correspond to this.

Disrupting connectivity again increases the NCP latency of handle 2304
further to 30ms (around #2630), and yet again (around #4487) to 50ms,
with the CIS playback remaining still in sync:
```
< ISO Data TX: Handle 2305 flags 0x02 dlen 121           #2639 [hci1] 21.106203
< ISO Data TX: Handle 2304 flags 0x02 dlen 121           #2640 [hci1] 21.106416
> HCI Event: Number of Completed Packets (0x13) plen 9   #2641 [hci1] 21.106660
        Num handles: 2
        Handle: 2304 Address: 28:3D:C2:4A:7E:DA (Samsung Electronics Co.,Ltd)
        Count: 1
        #2628: len 121 (32 Kb/s)
        Latency: 30 msec (5-30 msec ~30 msec)
        Handle: 2305 Address: 28:3D:C2:4A:7D:2A (Samsung Electronics Co.,Ltd)
        Count: 1
        #2636: len 121 (138 Kb/s)
        Latency: 7 msec (5-8 msec ~7 msec)
```
A fourth time results to packets for Handle 2304 no longer being
reported complete, and kernel stops scheduling more as it thinks there
are already several packets waiting in controller buffers:
```
< ISO Data TX: Handle 2304 flags 0x02 dlen 121           #6712 [hci1] 31.298684
< ISO Data TX: Handle 2304 flags 0x02 dlen 121           #6713 [hci1] 31.298693
> HCI Event: Number of Completed Packets (0x13) plen 5   #6714 [hci1] 31.306659
        Num handles: 1
        Handle: 2304 Address: 28:3D:C2:4A:7E:DA (Samsung Electronics Co.,Ltd)
        Count: 1
        #6695: len 121 (18 Kb/s)
        Latency: 52 msec (5-74 msec ~52 msec)
< ISO Data TX: Handle 2305 flags 0x02 dlen 121           #6715 [hci1] 31.306713
> HCI Event: Number of Completed Packets (0x13) plen 5   #6716 [hci1] 31.313659
        Num handles: 1
        Handle: 2305 Address: 28:3D:C2:4A:7D:2A (Samsung Electronics Co.,Ltd)
        Count: 1
        #6715: len 121 (161 Kb/s)
        Latency: 6 msec (4-8 msec ~7 msec)
< ISO Data TX: Handle 2305 flags 0x02 dlen 121           #6717 [hci1] 31.313685
> HCI Event: Number of Completed Packets (0x13) plen 5   #6718 [hci1] 31.321655
        Num handles: 1
        Handle: 2305 Address: 28:3D:C2:4A:7D:2A (Samsung Electronics Co.,Ltd)
        Count: 1
        #6717: len 121 (138 Kb/s)
        Latency: 7 msec (4-8 msec ~7 msec)
... (no further Number of Completed Packets for Handle 2304) ...
```
The connection is not disconnected, but kernel stops queueing more data
for Handle 2304 as there are no more NCP events e.g. packets #6712,
#6713 never get acknowledged.

In kernel logs in such situation:
```
[49492.883693] hci_rx_work:4101: hci1
[49492.883701] hci_rx_work:4145: hci1 Event packet
[49492.883708] hci_tx_work:3869: hci1 acl 4 sco 6 le 6 iso 1
[49492.883711] hci_sched_sco:3627: hci1
[49492.883712] hci_sched_esco:3650: hci1
[49492.883713] hci_sched_iso:3844: hci1
[49492.883715] hci_low_sent:3448: conn 0000000095db7cfc quote 1
[49492.883717] hci_sched_iso:3853: skb 00000000cc60fe22 len 125
[49492.883720] hci_send_frame:3040: hci1 type 5 len 125
[49492.883728] hci_sched_acl:3766: hci1
[49492.883729] hci_sched_le:3793: hci1
[49492.883731] hci_chan_sent:3487: hci1
[49492.890059] hci_send_iso:3365: hci1 len 121
[49492.890063] hci_queue_iso:3335: hci1 nonfrag skb 0000000084004c42 len 125
[49492.890087] hci_tx_work:3869: hci1 acl 4 sco 6 le 6 iso 0
[49492.890090] hci_sched_sco:3627: hci1
[49492.890091] hci_sched_esco:3650: hci1
[49492.890091] hci_sched_iso:3844: hci1
[49492.890092] hci_sched_acl:3766: hci1
[49492.890093] hci_sched_le:3793: hci1
[49492.890094] hci_chan_sent:3487: hci1
[49492.890733] hci_rx_work:4101: hci1
[49492.890740] hci_rx_work:4145: hci1 Event packet
[49492.890747] hci_tx_work:3869: hci1 acl 4 sco 6 le 6 iso 1
[49492.890751] hci_sched_sco:3627: hci1
[49492.890753] hci_sched_esco:3650: hci1
[49492.890754] hci_sched_iso:3844: hci1
[49492.890756] hci_low_sent:3448: conn 0000000095db7cfc quote 1
[49492.890758] hci_sched_iso:3853: skb 000000002d6239ed len 125
[49492.890761] hci_send_frame:3040: hci1 type 5 len 125
[49492.890770] hci_sched_acl:3766: hci1
[49492.890771] hci_sched_le:3793: hci1
[49492.890772] hci_chan_sent:3487: hci1
```
So the kernel indeed thinks all ISO buffers are full, and only one of
the two connections is making progress.

When the application stops queuing more data and the last packet for
Handle 2305 is acknowledged, kernel starts queuing more packets for
Handle 2304 (due to hci_conn_low_sent scheduling logic). Each sent
packet now gets a NCP event immediately:
```
< ISO Data TX: Handle 2305 flags 0x02 dlen 121           #8105 [hci1] 36.518739
> HCI Event: Number of Completed Packets (0x13) plen 5   #8106 [hci1] 36.526665
        Num handles: 1
        Handle: 2305 Address: 28:3D:C2:4A:7D:2A (Samsung Electronics Co.,Ltd)
        Count: 1
        #8105: len 121 (138 Kb/s)
        Latency: 7 msec (4-8 msec ~7 msec)
< ISO Data TX: Handle 2304 flags 0x02 dlen 121           #8107 [hci1] 36.526694
> HCI Event: Number of Completed Packets (0x13) plen 5   #8108 [hci1] 36.533663
        Num handles: 1
        Handle: 2304 Address: 28:3D:C2:4A:7E:DA (Samsung Electronics Co.,Ltd)
        Count: 1
        #6698: len 121 (0 Kb/s)
        Latency: 5271 msec (5-5271 msec ~2162 msec)
< ISO Data TX: Handle 2304 flags 0x02 dlen 121           #8109 [hci1] 36.533697
> HCI Event: Number of Completed Packets (0x13) plen 5   #8110 [hci1] 36.541651
        Num handles: 1
        Handle: 2304 Address: 28:3D:C2:4A:7E:DA (Samsung Electronics Co.,Ltd)
        Count: 1
        #6701: len 121 (0 Kb/s)
        Latency: 5272 msec (5-5272 msec ~3717 msec)
...
< ISO Data TX: Handle 2304 flags 0x02 dlen 121           #8117 [hci1] 36.563720
> HCI Event: Number of Completed Packets (0x13) plen 5   #8118 [hci1] 36.571665
        Num handles: 1
        Handle: 2304 Address: 28:3D:C2:4A:7E:DA (Samsung Electronics Co.,Ltd)
        Count: 1
        #6712: len 121 (0 Kb/s)
        Latency: 5272 msec (5-5272 msec ~4300 msec)
```
Note huge latency values: btmon and kernel are thinking these are the
packets queued to controller buffers long ago.

That these "delayed" NCP events come immediately after kernel sends
a new ISO data packet for the handle, and that there still remain
unacknowledged packets at the end, it looks as if each of these events
actually corresponds to the packet sent immediately before, and the
controller has failed to send NCP events for the earlier packets.

Similarly, as the CIS playback remained in sync despite "latency"
increase, one can suspect some packets were silently dropped, and
the NCP flow control accounting is off by some number of packets.

***

Also, in another btmon log under similar conditions, but without trying
to disrupt the connection, at the end of the transmission we see the
last ISO Data TX events:
```
bluetoothd[2231]: < ISO Data TX: Handle 2304 flags 0x02 dlen 121               #6405 [hci1] 22.231296
bluetoothd[2231]: < ISO Data TX: Handle 2305 flags 0x02 dlen 121               #6406 [hci1] 22.231317
...
bluetoothd[2231]: < ISO Data TX: Handle 2304 flags 0x02 dlen 121               #6408 [hci1] 22.239306
bluetoothd[2231]: < ISO Data TX: Handle 2305 flags 0x02 dlen 121               #6409 [hci1] 22.239315
...
bluetoothd[2231]: < ISO Data TX: Handle 2304 flags 0x02 dlen 121               #6411 [hci1] 22.246330
bluetoothd[2231]: < ISO Data TX: Handle 2305 flags 0x02 dlen 121               #6412 [hci1] 22.246351
...
bluetoothd[2231]: < ISO Data TX: Handle 2304 flags 0x02 dlen 121               #6414 [hci1] 22.254364
bluetoothd[2231]: < ISO Data TX: Handle 2305 flags 0x02 dlen 121               #6415 [hci1] 22.254375
> HCI Event: Number of Completed Packets (0x13) plen 9                         #6416 [hci1] 22.261249
        Num handles: 2
        Handle: 2304 Address: 28:3D:C2:4A:7E:DA (Samsung Electronics Co.,Ltd)
        Count: 1
        #6405: len 121 (33 Kb/s)
        Latency: 29 msec (2-59 msec ~29 msec)
        Handle: 2305 Address: 28:3D:C2:4A:7D:2A (Samsung Electronics Co.,Ltd)
        Count: 1
        #6406: len 121 (33 Kb/s)
        Latency: 29 msec (2-44 msec ~29 msec)
bluetoothd[2231]: < ACL Data TX: Handle 2049 flags 0x00 dlen 10                #6417 [hci1] 23.226334
      ATT: Write Command (0x52) len 5
        Handle: 0x0043 Type: ASE Control Point (0x2bc6)
          Data[3]: 050101
            Opcode: Disable (0x05)
            Number of ASE(s): 1
            ASE: #0
            ASE ID: 0x01
...
< HCI Command: Disconnect (0x01|0x0006) plen 3                                 #6422 [hci1] 23.307036
        Handle: 2304 Address: 28:3D:C2:4A:7E:DA (Samsung Electronics Co.,Ltd)
        Reason: Remote User Terminated Connection (0x13)
...
< HCI Command: Disconnect (0x01|0x0006) plen 3                                 #6438 [hci1] 23.374932
        Handle: 2305 Address: 28:3D:C2:4A:7D:2A (Samsung Electronics Co.,Ltd)
        Reason: Remote User Terminated Connection (0x13)
```
At the end of streaming, NCP events remain missing for 3 packets for
both CIS, despite there is one second delay between stopping sending
data and starting disconnect procedures.

----------

Firmware version info:
```
[   27.584475] Bluetooth: hci1: Device revision is 0
[   27.584479] Bluetooth: hci1: Secure boot is enabled
[   27.584481] Bluetooth: hci1: OTP lock is enabled
[   27.584482] Bluetooth: hci1: API lock is enabled
[   27.584483] Bluetooth: hci1: Debug lock is disabled
[   27.584484] Bluetooth: hci1: Minimum firmware build 1 week 10 2014
[   27.584486] Bluetooth: hci1: Bootloader timestamp 2019.40 buildtype 1 build 38
[   27.584491] Bluetooth: hci1: No support for _PRR ACPI method
[   27.643226] Bluetooth: hci1: Found device firmware: intel/ibt-0041-0041.sfi
[   27.643280] Bluetooth: hci1: Boot Address: 0x100800
[   27.643282] Bluetooth: hci1: Firmware Version: 123-8.24
[   29.575500] Bluetooth: hci1: Waiting for firmware download to complete
[   29.575701] Bluetooth: hci1: Firmware loaded in 1887177 usecs
[   29.575747] Bluetooth: hci1: Waiting for device to boot
[   29.600715] Bluetooth: hci1: Device booted in 24401 usecs
[   29.600719] Bluetooth: hci1: Malformed MSFT vendor event: 0x02
[   29.601167] Bluetooth: hci1: Found Intel DDC parameters: intel/ibt-0041-0041.ddc
[   29.602729] Bluetooth: hci1: Applying Intel DDC parameters completed
[   29.605729] Bluetooth: hci1: Firmware timestamp 2024.8 buildtype 1 build 79483
[   29.605737] Bluetooth: hci1: Firmware SHA1: 0x7c7b67b9
```

-- 
Pauli Virtanen





[Index of Archives]     [Bluez Devel]     [Linux Wireless Networking]     [Linux Wireless Personal Area Networking]     [Linux ATH6KL]     [Linux USB Devel]     [Linux Media Drivers]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Big List of Linux Books]

  Powered by Linux