Hi, The behavior of HCI ISO Data Packet flow control on Intel AX210 seems to have changed between firmware versions 23.20.0.3 (fw build REL75234, Jan 2024) and 23.40.0.2 (fw build REL79483, Mar 2024). In practice, with this firmware version TWS playback with two CIS essentially always loses one of the two streams, as kernel stops sending further data to the controller for one of the two. This does not occur with the earlier firmware version. The new behavior looks as if the controller is failing to send Number of Completed Packets events for some of the ISO data packets it actually has processed (which would seem wrong, Core v5.4 Vol 4 Part E §4.1.1), causing the kernel to stop queueing further data. btsnoop log when streaming data to two CIS: https://gitlab.freedesktop.org/pvir/repros/-/raw/main/2024-04-22_1.log?inline=false *** Looking at `btmon -r 2024-04-22_1.log` output: Initially it looks fine: ``` < ISO Data TX: Handle 2305 flags 0x02 dlen 121 #1283 [hci1] 17.716197 < ISO Data TX: Handle 2304 flags 0x02 dlen 121 #1284 [hci1] 17.716214 > HCI Event: Number of Completed Packets (0x13) plen 9 #1285 [hci1] 17.716529 Num handles: 2 Handle: 2304 Address: 28:3D:C2:4A:7E:DA (Samsung Electronics Co.,Ltd) Count: 1 #1280: len 121 (138 Kb/s) Latency: 7 msec (5-8 msec ~7 msec) Handle: 2305 Address: 28:3D:C2:4A:7D:2A (Samsung Electronics Co.,Ltd) Count: 1 #1281: len 121 (138 Kb/s) Latency: 7 msec (5-8 msec ~7 msec) ``` After physically covering one of the receiver devices briefly, which appears to temporarily reduce the connection quality: ``` < ISO Data TX: Handle 2305 flags 0x02 dlen 121 #1430 [hci1] 18.083668 < ISO Data TX: Handle 2304 flags 0x02 dlen 121 #1431 [hci1] 18.083684 > HCI Event: Number of Completed Packets (0x13) plen 9 #1432 [hci1] 18.083660 Num handles: 2 Handle: 2304 Address: 28:3D:C2:4A:7E:DA (Samsung Electronics Co.,Ltd) Count: 1 #1425: len 121 (69 Kb/s) Latency: 14 msec (5-15 msec ~15 msec) Handle: 2305 Address: 28:3D:C2:4A:7D:2A (Samsung Electronics Co.,Ltd) Count: 1 #1427: len 121 (138 Kb/s) Latency: 7 msec (5-8 msec ~7 msec) ``` The latencies stay constant after this. In this situation when the two CIS have different latency (as seen in btmon), in the earlier firmware the CIS playback was desynchronized (cf https://github.com/bluez/bluez/issues/515), but in this instance they remain synchronized despite the latency difference in the NCP events. Note however that CIS desynchronization also still occurs with this firmware revision, but we can no longer compensate for it on application side, as the NCP events no longer correspond to this. Disrupting connectivity again increases the NCP latency of handle 2304 further to 30ms (around #2630), and yet again (around #4487) to 50ms, with the CIS playback remaining still in sync: ``` < ISO Data TX: Handle 2305 flags 0x02 dlen 121 #2639 [hci1] 21.106203 < ISO Data TX: Handle 2304 flags 0x02 dlen 121 #2640 [hci1] 21.106416 > HCI Event: Number of Completed Packets (0x13) plen 9 #2641 [hci1] 21.106660 Num handles: 2 Handle: 2304 Address: 28:3D:C2:4A:7E:DA (Samsung Electronics Co.,Ltd) Count: 1 #2628: len 121 (32 Kb/s) Latency: 30 msec (5-30 msec ~30 msec) Handle: 2305 Address: 28:3D:C2:4A:7D:2A (Samsung Electronics Co.,Ltd) Count: 1 #2636: len 121 (138 Kb/s) Latency: 7 msec (5-8 msec ~7 msec) ``` A fourth time results to packets for Handle 2304 no longer being reported complete, and kernel stops scheduling more as it thinks there are already several packets waiting in controller buffers: ``` < ISO Data TX: Handle 2304 flags 0x02 dlen 121 #6712 [hci1] 31.298684 < ISO Data TX: Handle 2304 flags 0x02 dlen 121 #6713 [hci1] 31.298693 > HCI Event: Number of Completed Packets (0x13) plen 5 #6714 [hci1] 31.306659 Num handles: 1 Handle: 2304 Address: 28:3D:C2:4A:7E:DA (Samsung Electronics Co.,Ltd) Count: 1 #6695: len 121 (18 Kb/s) Latency: 52 msec (5-74 msec ~52 msec) < ISO Data TX: Handle 2305 flags 0x02 dlen 121 #6715 [hci1] 31.306713 > HCI Event: Number of Completed Packets (0x13) plen 5 #6716 [hci1] 31.313659 Num handles: 1 Handle: 2305 Address: 28:3D:C2:4A:7D:2A (Samsung Electronics Co.,Ltd) Count: 1 #6715: len 121 (161 Kb/s) Latency: 6 msec (4-8 msec ~7 msec) < ISO Data TX: Handle 2305 flags 0x02 dlen 121 #6717 [hci1] 31.313685 > HCI Event: Number of Completed Packets (0x13) plen 5 #6718 [hci1] 31.321655 Num handles: 1 Handle: 2305 Address: 28:3D:C2:4A:7D:2A (Samsung Electronics Co.,Ltd) Count: 1 #6717: len 121 (138 Kb/s) Latency: 7 msec (4-8 msec ~7 msec) ... (no further Number of Completed Packets for Handle 2304) ... ``` The connection is not disconnected, but kernel stops queueing more data for Handle 2304 as there are no more NCP events e.g. packets #6712, #6713 never get acknowledged. In kernel logs in such situation: ``` [49492.883693] hci_rx_work:4101: hci1 [49492.883701] hci_rx_work:4145: hci1 Event packet [49492.883708] hci_tx_work:3869: hci1 acl 4 sco 6 le 6 iso 1 [49492.883711] hci_sched_sco:3627: hci1 [49492.883712] hci_sched_esco:3650: hci1 [49492.883713] hci_sched_iso:3844: hci1 [49492.883715] hci_low_sent:3448: conn 0000000095db7cfc quote 1 [49492.883717] hci_sched_iso:3853: skb 00000000cc60fe22 len 125 [49492.883720] hci_send_frame:3040: hci1 type 5 len 125 [49492.883728] hci_sched_acl:3766: hci1 [49492.883729] hci_sched_le:3793: hci1 [49492.883731] hci_chan_sent:3487: hci1 [49492.890059] hci_send_iso:3365: hci1 len 121 [49492.890063] hci_queue_iso:3335: hci1 nonfrag skb 0000000084004c42 len 125 [49492.890087] hci_tx_work:3869: hci1 acl 4 sco 6 le 6 iso 0 [49492.890090] hci_sched_sco:3627: hci1 [49492.890091] hci_sched_esco:3650: hci1 [49492.890091] hci_sched_iso:3844: hci1 [49492.890092] hci_sched_acl:3766: hci1 [49492.890093] hci_sched_le:3793: hci1 [49492.890094] hci_chan_sent:3487: hci1 [49492.890733] hci_rx_work:4101: hci1 [49492.890740] hci_rx_work:4145: hci1 Event packet [49492.890747] hci_tx_work:3869: hci1 acl 4 sco 6 le 6 iso 1 [49492.890751] hci_sched_sco:3627: hci1 [49492.890753] hci_sched_esco:3650: hci1 [49492.890754] hci_sched_iso:3844: hci1 [49492.890756] hci_low_sent:3448: conn 0000000095db7cfc quote 1 [49492.890758] hci_sched_iso:3853: skb 000000002d6239ed len 125 [49492.890761] hci_send_frame:3040: hci1 type 5 len 125 [49492.890770] hci_sched_acl:3766: hci1 [49492.890771] hci_sched_le:3793: hci1 [49492.890772] hci_chan_sent:3487: hci1 ``` So the kernel indeed thinks all ISO buffers are full, and only one of the two connections is making progress. When the application stops queuing more data and the last packet for Handle 2305 is acknowledged, kernel starts queuing more packets for Handle 2304 (due to hci_conn_low_sent scheduling logic). Each sent packet now gets a NCP event immediately: ``` < ISO Data TX: Handle 2305 flags 0x02 dlen 121 #8105 [hci1] 36.518739 > HCI Event: Number of Completed Packets (0x13) plen 5 #8106 [hci1] 36.526665 Num handles: 1 Handle: 2305 Address: 28:3D:C2:4A:7D:2A (Samsung Electronics Co.,Ltd) Count: 1 #8105: len 121 (138 Kb/s) Latency: 7 msec (4-8 msec ~7 msec) < ISO Data TX: Handle 2304 flags 0x02 dlen 121 #8107 [hci1] 36.526694 > HCI Event: Number of Completed Packets (0x13) plen 5 #8108 [hci1] 36.533663 Num handles: 1 Handle: 2304 Address: 28:3D:C2:4A:7E:DA (Samsung Electronics Co.,Ltd) Count: 1 #6698: len 121 (0 Kb/s) Latency: 5271 msec (5-5271 msec ~2162 msec) < ISO Data TX: Handle 2304 flags 0x02 dlen 121 #8109 [hci1] 36.533697 > HCI Event: Number of Completed Packets (0x13) plen 5 #8110 [hci1] 36.541651 Num handles: 1 Handle: 2304 Address: 28:3D:C2:4A:7E:DA (Samsung Electronics Co.,Ltd) Count: 1 #6701: len 121 (0 Kb/s) Latency: 5272 msec (5-5272 msec ~3717 msec) ... < ISO Data TX: Handle 2304 flags 0x02 dlen 121 #8117 [hci1] 36.563720 > HCI Event: Number of Completed Packets (0x13) plen 5 #8118 [hci1] 36.571665 Num handles: 1 Handle: 2304 Address: 28:3D:C2:4A:7E:DA (Samsung Electronics Co.,Ltd) Count: 1 #6712: len 121 (0 Kb/s) Latency: 5272 msec (5-5272 msec ~4300 msec) ``` Note huge latency values: btmon and kernel are thinking these are the packets queued to controller buffers long ago. That these "delayed" NCP events come immediately after kernel sends a new ISO data packet for the handle, and that there still remain unacknowledged packets at the end, it looks as if each of these events actually corresponds to the packet sent immediately before, and the controller has failed to send NCP events for the earlier packets. Similarly, as the CIS playback remained in sync despite "latency" increase, one can suspect some packets were silently dropped, and the NCP flow control accounting is off by some number of packets. *** Also, in another btmon log under similar conditions, but without trying to disrupt the connection, at the end of the transmission we see the last ISO Data TX events: ``` bluetoothd[2231]: < ISO Data TX: Handle 2304 flags 0x02 dlen 121 #6405 [hci1] 22.231296 bluetoothd[2231]: < ISO Data TX: Handle 2305 flags 0x02 dlen 121 #6406 [hci1] 22.231317 ... bluetoothd[2231]: < ISO Data TX: Handle 2304 flags 0x02 dlen 121 #6408 [hci1] 22.239306 bluetoothd[2231]: < ISO Data TX: Handle 2305 flags 0x02 dlen 121 #6409 [hci1] 22.239315 ... bluetoothd[2231]: < ISO Data TX: Handle 2304 flags 0x02 dlen 121 #6411 [hci1] 22.246330 bluetoothd[2231]: < ISO Data TX: Handle 2305 flags 0x02 dlen 121 #6412 [hci1] 22.246351 ... bluetoothd[2231]: < ISO Data TX: Handle 2304 flags 0x02 dlen 121 #6414 [hci1] 22.254364 bluetoothd[2231]: < ISO Data TX: Handle 2305 flags 0x02 dlen 121 #6415 [hci1] 22.254375 > HCI Event: Number of Completed Packets (0x13) plen 9 #6416 [hci1] 22.261249 Num handles: 2 Handle: 2304 Address: 28:3D:C2:4A:7E:DA (Samsung Electronics Co.,Ltd) Count: 1 #6405: len 121 (33 Kb/s) Latency: 29 msec (2-59 msec ~29 msec) Handle: 2305 Address: 28:3D:C2:4A:7D:2A (Samsung Electronics Co.,Ltd) Count: 1 #6406: len 121 (33 Kb/s) Latency: 29 msec (2-44 msec ~29 msec) bluetoothd[2231]: < ACL Data TX: Handle 2049 flags 0x00 dlen 10 #6417 [hci1] 23.226334 ATT: Write Command (0x52) len 5 Handle: 0x0043 Type: ASE Control Point (0x2bc6) Data[3]: 050101 Opcode: Disable (0x05) Number of ASE(s): 1 ASE: #0 ASE ID: 0x01 ... < HCI Command: Disconnect (0x01|0x0006) plen 3 #6422 [hci1] 23.307036 Handle: 2304 Address: 28:3D:C2:4A:7E:DA (Samsung Electronics Co.,Ltd) Reason: Remote User Terminated Connection (0x13) ... < HCI Command: Disconnect (0x01|0x0006) plen 3 #6438 [hci1] 23.374932 Handle: 2305 Address: 28:3D:C2:4A:7D:2A (Samsung Electronics Co.,Ltd) Reason: Remote User Terminated Connection (0x13) ``` At the end of streaming, NCP events remain missing for 3 packets for both CIS, despite there is one second delay between stopping sending data and starting disconnect procedures. ---------- Firmware version info: ``` [ 27.584475] Bluetooth: hci1: Device revision is 0 [ 27.584479] Bluetooth: hci1: Secure boot is enabled [ 27.584481] Bluetooth: hci1: OTP lock is enabled [ 27.584482] Bluetooth: hci1: API lock is enabled [ 27.584483] Bluetooth: hci1: Debug lock is disabled [ 27.584484] Bluetooth: hci1: Minimum firmware build 1 week 10 2014 [ 27.584486] Bluetooth: hci1: Bootloader timestamp 2019.40 buildtype 1 build 38 [ 27.584491] Bluetooth: hci1: No support for _PRR ACPI method [ 27.643226] Bluetooth: hci1: Found device firmware: intel/ibt-0041-0041.sfi [ 27.643280] Bluetooth: hci1: Boot Address: 0x100800 [ 27.643282] Bluetooth: hci1: Firmware Version: 123-8.24 [ 29.575500] Bluetooth: hci1: Waiting for firmware download to complete [ 29.575701] Bluetooth: hci1: Firmware loaded in 1887177 usecs [ 29.575747] Bluetooth: hci1: Waiting for device to boot [ 29.600715] Bluetooth: hci1: Device booted in 24401 usecs [ 29.600719] Bluetooth: hci1: Malformed MSFT vendor event: 0x02 [ 29.601167] Bluetooth: hci1: Found Intel DDC parameters: intel/ibt-0041-0041.ddc [ 29.602729] Bluetooth: hci1: Applying Intel DDC parameters completed [ 29.605729] Bluetooth: hci1: Firmware timestamp 2024.8 buildtype 1 build 79483 [ 29.605737] Bluetooth: hci1: Firmware SHA1: 0x7c7b67b9 ``` -- Pauli Virtanen