Hi,
Kernel version: 6.5.13 #1 SMP PREEMPT_DYNAMIC
BlueZ version: 5.64
source code I add log: data-buffer-overflow.zip
syslog snippet: kernel-log-report.txt
I have also posted this issue on https://github.com/bluez/bluez/issues/915.
This issue is only reproduced in multi-link setup: There is a BlueZ Central device is connected to other two Peripheral devices.
These three devices have high ACL data traffics.
We have observed that data buffer overflow occurred in a strange manner. Because we are the chip vendor, we can easily debug inside the controller.
All evidence shows it was not the controller's bug.
Finally, we add debug log in the kernel.
The debug log confirms what we captured using HCI sniffer. That is, BlueZ sends ACL data overflowed the Tx buffer number of the controller, in our case the buffer number is 12.
Additionally, a code review of hci_sched_le() raises some doubts that may have caused this issue.
- "hdev->le_cnt" is the variable used to record current "credit" that Host can send HCI frames to the controller, it is consumed here but granted in hci_num_comp_pkts_evt(). It is unsigned int, in front of this function, unsigned int variable "le_cnt" assigned to int variable "cnt", and the following calculation the "cnt" was used. This is a risky assignment.
- The inner while loop does not check the value of "cnt" variable. As a result, it runs more times than the limited number and causes "cnt" to underflow.
- When "cnt" underflow, the outer while loop runs more times than expected.
I have added the following test patch to verify the root cause I guessed. It is proven that it works.
while (cnt > 0 && (chan = hci_chan_sent(hdev, LE_LINK, "e))) {
u32 priority = (skb_peek(&chan->data_q))->priority;
while (cnt > 0 && quote-- && (skb = skb_peek(&chan->data_q))) {
Please kindly have a look at this issue, thanks.
Best Regards,
James
Aug 2 16:26:56 james-vm3 kernel: [ 143.031280] hci0 acl 0 sco 0 le 3 iso 0 Aug 2 16:26:56 james-vm3 kernel: [ 143.031283] chan 0000000078d3f816 quote 3 Aug 2 16:26:56 james-vm3 kernel: [ 143.031291] hci_sched_le(cnt 2 quote 2)[A 614,P 604] Aug 2 16:26:56 james-vm3 kernel: [ 143.037484] hci0 acl 0 sco 0 le 2 iso 0 Aug 2 16:26:56 james-vm3 kernel: [ 143.037505] chan 00000000493dd831 quote 2 Aug 2 16:26:56 james-vm3 kernel: [ 143.037524] hci_sched_le(cnt 1 quote 1)[A 615,P 604] Aug 2 16:26:56 james-vm3 kernel: [ 143.042290] hci0 acl 0 sco 0 le 1 iso 0 Aug 2 16:26:56 james-vm3 kernel: [ 143.042296] chan 0000000078d3f816 quote 1 Aug 2 16:26:56 james-vm3 kernel: [ 143.042309] hci_sched_le(cnt 0 quote 0)[A 616,P 604] Aug 2 16:26:56 james-vm3 kernel: [ 143.049360] num_pkt=1,[A616,P605] Aug 2 16:26:56 james-vm3 kernel: [ 143.049383] hci0 acl 0 sco 0 le 1 iso 0 Aug 2 16:26:56 james-vm3 kernel: [ 143.049535] hci0 acl 0 sco 0 le 1 iso 0 Aug 2 16:26:56 james-vm3 kernel: [ 143.049538] chan 00000000493dd831 quote 1 Aug 2 16:26:56 james-vm3 kernel: [ 143.049548] hci_sched_le(cnt 0 quote 0)[A 617,P 605] Aug 2 16:26:56 james-vm3 kernel: [ 143.050927] num_pkt=1,[A617,P606] Aug 2 16:26:56 james-vm3 kernel: [ 143.050933] hci0 acl 0 sco 0 le 1 iso 0 Aug 2 16:26:56 james-vm3 kernel: [ 143.050941] num_pkt=1,[A617,P607] Aug 2 16:26:56 james-vm3 kernel: [ 143.050943] hci0 acl 0 sco 0 le 2 iso 0 Aug 2 16:26:56 james-vm3 kernel: [ 143.053544] hci0 acl 0 sco 0 le 2 iso 0 Aug 2 16:26:56 james-vm3 kernel: [ 143.053548] chan 0000000078d3f816 quote 2 Aug 2 16:26:56 james-vm3 kernel: [ 143.053571] hci_sched_le(cnt 1 quote 1)[A 618,P 607] Aug 2 16:26:56 james-vm3 kernel: [ 143.055131] num_pkt=1,[A618,P608] Aug 2 16:26:56 james-vm3 kernel: [ 143.055135] hci0 acl 0 sco 0 le 2 iso 0 Aug 2 16:26:56 james-vm3 kernel: [ 143.059311] hci0 acl 0 sco 0 le 2 iso 0 Aug 2 16:26:56 james-vm3 kernel: [ 143.059316] chan 00000000493dd831 quote 2 Aug 2 16:26:56 james-vm3 kernel: [ 143.059358] hci_sched_le(cnt 1 quote 1)[A 619,P 608] Aug 2 16:26:56 james-vm3 kernel: [ 143.061031] hci0 acl 0 sco 0 le 1 iso 0 Aug 2 16:26:56 james-vm3 kernel: [ 143.061036] chan 00000000493dd831 quote 1 Aug 2 16:26:56 james-vm3 kernel: [ 143.061061] hci_sched_le(cnt 0 quote 0)[A 620,P 608] Aug 2 16:26:56 james-vm3 kernel: [ 143.064748] hci0 acl 0 sco 0 le 0 iso 0 Aug 2 16:26:56 james-vm3 kernel: [ 143.073069] hci0 acl 0 sco 0 le 0 iso 0 Aug 2 16:26:56 james-vm3 kernel: [ 143.075569] hci0 acl 0 sco 0 le 0 iso 0 Aug 2 16:26:56 james-vm3 kernel: [ 143.084197] num_pkt=1,[A620,P609] Aug 2 16:26:56 james-vm3 kernel: [ 143.084218] num_pkt=1,[A620,P610] Aug 2 16:26:56 james-vm3 kernel: [ 143.084219] hci0 acl 0 sco 0 le 2 iso 0 Aug 2 16:26:56 james-vm3 kernel: [ 143.084221] chan 00000000493dd831 quote 1 Aug 2 16:26:56 james-vm3 kernel: [ 143.084227] hci_sched_le(cnt 1 quote 0)[A 621,P 610] Aug 2 16:26:56 james-vm3 kernel: [ 143.084228] chan 0000000078d3f816 quote 2 Aug 2 16:26:56 james-vm3 kernel: [ 143.084229] hci_sched_le(cnt 0 quote 1)[A 622,P 610] Aug 2 16:26:56 james-vm3 kernel: [ 143.084230] hci_sched_le(cnt -1 quote 0)[A 623,P 610] Aug 2 16:26:56 james-vm3 kernel: [ 143.084685] hci0 acl 0 sco 0 le -1 iso 0 Aug 2 16:26:56 james-vm3 kernel: [ 143.084689] chan 00000000493dd831 quote -1 Aug 2 16:26:56 james-vm3 kernel: [ 143.084717] hci_sched_le(cnt -2 quote -2)[A 624,P 610] Aug 2 16:26:56 james-vm3 kernel: [ 143.086506] hci0 acl 0 sco 0 le -2 iso 0 Aug 2 16:26:56 james-vm3 kernel: [ 143.086512] chan 0000000078d3f816 quote -2 Aug 2 16:26:56 james-vm3 kernel: [ 143.086524] hci_sched_le(cnt -3 quote -3)[A 625,P 610] Aug 2 16:26:56 james-vm3 kernel: [ 143.091071] num_pkt=1,[A625,P611] Aug 2 16:26:56 james-vm3 kernel: [ 143.091093] hci0 acl 0 sco 0 le 12 iso 0 Aug 2 16:26:56 james-vm3 kernel: [ 143.096148] hci0 acl 0 sco 0 le 12 iso 0 Aug 2 16:26:56 james-vm3 kernel: [ 143.096152] chan 00000000493dd831 quote 12 Aug 2 16:26:56 james-vm3 kernel: [ 143.096170] hci_sched_le(cnt 11 quote 11)[A 626,P 611] Aug 2 16:26:56 james-vm3 kernel: [ 143.097906] hci0 acl 0 sco 0 le 11 iso 0 Aug 2 16:26:56 james-vm3 kernel: [ 143.097916] chan 0000000078d3f816 quote 11 Aug 2 16:26:56 james-vm3 kernel: [ 143.097955] hci_sched_le(cnt 10 quote 10)[A 627,P 611]
<<attachment: data-buffer-overflow.zip>>