Re: ath11k: WCN6855: possible ring buffer corruption

Hi Jeff,

The ath11k ring-buffer corruption issue is hurting some users of the
Lenovo ThinkPad X13s quite badly, so I promised to try to escalate this
with you and Qualcomm.

The chance of hitting the bug seems to depend on the AP/network, and it
also seems that my hypothesis was right: enabling the GIC ITS, which
increases parallelism by spreading interrupt handling over all cores,
does indeed make it easier to hit this.

The latter could indicate a driver bug, even if this could very well be
a firmware issue.

Have you had a chance to look into this yet? Can you tell from the logs
and reported symptoms whether this is a firmware bug or not?

On Tue, Apr 16, 2024 at 05:40:43PM +0200, Johan Hovold wrote:

> Over the past year I've received occasional reports from users of the
> Lenovo ThinkPad X13s (aarch64) that the wifi sometimes stops working.
> When this happens the kernel log is filled with errors like:
> 
> [ 1164.962227] ath11k_warn: 222 callbacks suppressed
> [ 1164.962238] ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1484, expected 1492
> [ 1164.962309] ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1460, expected 1484
> [ 1164.962994] ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1476, expected 1484
> [ 1164.963405] ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1484, expected 1488
> [ 1164.963701] ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1480, expected 1484
> [ 1164.963852] ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1468, expected 1480
> [ 1164.964491] ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1484, expected 1492
> [ 1164.964733] ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1488, expected 1492
> [ 1165.198329] ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1460, expected 1488
> [ 1165.198470] ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1460, expected 1476
> [ 1166.266513] ath11k_pci 0006:01:00.0: wmi tlv parse failure of tag 2699 at byte 348 (1132 bytes left, 64788 expected)
> [ 1166.542803] ath11k_pci 0006:01:00.0: wmi tlv parse failure of tag 4270 at byte 348 (1128 bytes left, 63772 expected)
> [ 1166.768238] ath11k_pci 0006:01:00.0: wmi tlv parse failure of tag 0 at byte 376 (1112 bytes left, 11730 expected)
> [ 1166.900152] ath11k_pci 0006:01:00.0: wmi tlv parse failure of tag 3 at byte 790 (694 bytes left, 16256 expected)
> [ 1168.499073] ath11k_pci 0006:01:00.0: wmi tlv parse failure of tag 1 at byte 62 (1426 bytes left, 3089 expected)
> [ 1168.818086] ath11k_pci 0006:01:00.0: wmi tlv parse failure of tag 63063 at byte 1466 (10 bytes left, 50467 expected)
> [ 1169.032885] ath11k_pci 0006:01:00.0: wmi tlv parse failure of tag 0 at byte 364 (1120 bytes left, 12483 expected)
> [ 1169.308546] ath11k_pci 0006:01:00.0: wmi tlv parse failure of tag 3092 at byte 348 (1128 bytes left, 64780 expected)
> [ 1169.563928] ath11k_pci 0006:01:00.0: wmi tlv parse failure of tag 1 at byte 348 (1124 bytes left, 44062 expected)
> 
> which after a quick look at the driver seems to suggest that we may be
> hitting some kind of ring buffer corruption.
> 
> Rebinding the driver supposedly sometimes makes things work again, but
> not always.
> 
> The issue has been confirmed with the 6.8 kernel and the latest firmware
> WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.37.
> 
> I've triggered this issue twice myself with 6.6 and .23 firmware, but
> the reports date back to at least 6.2 and likely when using even older
> firmware.
> 
> An unconfirmed hypothesis is that we may be hitting this more often when
> enabling the GIC ITS so that the interrupt processing is spread out over
> all cores (unlike when using the DWC controller's internal MSI
> implementation). This change is now merged for 6.10.
> 
> Do you have any immediate theories about what could be causing this?
> Does it look like a firmware or driver issue to you, for example? Is it
> something you've seen before?
> 
> Note that I've previously reported this here:
> 
> 	https://bugzilla.kernel.org/show_bug.cgi?id=218623
 
Johan



