Re: [PATCH v3] wifi: ath12k: Add firmware coredump collection support

On 6/26/2024 8:39 PM, Kalle Valo wrote:
> Kalle Valo <kvalo@xxxxxxxxxx> writes:
> 
>> Sowmiya Sree Elavalagan <quic_ssreeela@xxxxxxxxxxx> wrote:
>>
>>> In case of a firmware assert, a snapshot of firmware memory is essential for
>>> debugging. Add firmware coredump collection support for PCI bus.
>>> Collect RDDM and firmware paging dumps from MHI and pack them in TLV
>>> format and also pack various memory shared during QMI phase in separate
>>> TLVs.  Add necessary header and share the dumps to user space using dev
>>> coredump framework. Coredump collection is disabled by default and can
>>> be enabled using menuconfig. Dump collected for a radio is 55 MB
>>> approximately.
>>>
>>> Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.2.1-00201-QCAHKSWPL_SILICONZ-1
>>>
>>> Signed-off-by: Sowmiya Sree Elavalagan <quic_ssreeela@xxxxxxxxxxx>
>>> Acked-by: Jeff Johnson <quic_jjohnson@xxxxxxxxxxx>
>>> Signed-off-by: Kalle Valo <quic_kvalo@xxxxxxxxxxx>
>>
>> This didn't compile for me, I added this to pci.c:
>>
>> +#include <linux/vmalloc.h>
>>
>> Also in the pending branch I made some whitespace changes in struct ath12k_dump_file_data:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=pending&id=44ae07628b68375f476895f4fc1e89a570790ac0
>>
>> Any tips how to test this until we have the debugfs interface to crash the firmware?
> 
> I was able to get patch 'wifi: ath12k: Add support to simulate firmware
> crash' (not yet public) and did a quick test with it. There seems to be
> a KASAN warning but I can't debug this further at this time.
> 
> [ 8091.304272] ath12k_pci 0000:06:00.0: simulating firmware assert crash
> [ 8091.722245] ==================================================================
> [ 8091.722329] BUG: KASAN: vmalloc-out-of-bounds in ath12k_pci_coredump_download+0x1071/0x1330 [ath12k]
> [ 8091.722433] Write of size 4 at addr ffffc9000644b28c by task kworker/u32:0/11
> [ 8091.722517] 
> [ 8091.722552] CPU: 0 PID: 11 Comm: kworker/u32:0 Not tainted 6.10.0-rc4-wt-ath+ #1663
> [ 8091.722604] Hardware name: Intel(R) Client Systems NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0067.2021.0528.1339 05/28/2021
> [ 8091.722670] Workqueue: ath12k_aux_wq ath12k_core_reset [ath12k]
> [ 8091.722742] Call Trace:
> [ 8091.722778]  <TASK>
> [ 8091.722832]  dump_stack_lvl+0x7d/0xe0
> [ 8091.722920]  print_address_description.constprop.0+0x33/0x3a0
> [ 8091.722999]  print_report+0xb5/0x260
> [ 8091.723069]  ? kasan_addr_to_slab+0xd/0x80
> [ 8091.723146]  kasan_report+0xd8/0x110
> [ 8091.723217]  ? ath12k_pci_coredump_download+0x1071/0x1330 [ath12k]
> [ 8091.723301]  ? ath12k_pci_coredump_download+0x1071/0x1330 [ath12k]
> [ 8091.723386]  __asan_report_store_n_noabort+0x12/0x20
> [ 8091.723461]  ath12k_pci_coredump_download+0x1071/0x1330 [ath12k]
> [ 8091.723563]  ? ath12k_pci_coredump_calculate_size+0x730/0x730 [ath12k]
> [ 8091.723632]  ? __this_cpu_preempt_check+0x13/0x20
> [ 8091.723677]  ath12k_coredump_collect+0x60/0x73 [ath12k]
> [ 8091.724276]  ath12k_core_reset+0x1b1/0x880 [ath12k]
> [ 8091.724921]  ? _raw_spin_unlock_irq+0x22/0x50
> [ 8091.725503]  ? __this_cpu_preempt_check+0x13/0x20
> [ 8091.726126]  process_one_work+0x8d7/0x19f0
> [ 8091.726718]  ? pwq_dec_nr_in_flight+0x580/0x580
> [ 8091.727346]  ? move_linked_works+0x128/0x2c0
> [ 8091.727998]  ? assign_work+0x15e/0x270
> [ 8091.728601]  worker_thread+0x715/0x1270
> [ 8091.729244]  ? rescuer_thread+0xdb0/0xdb0
> [ 8091.729905]  kthread+0x2fa/0x3f0
> [ 8091.730520]  ? kthread_insert_work_sanity_check+0xd0/0xd0
> [ 8091.731192]  ret_from_fork+0x31/0x70
> [ 8091.731856]  ? kthread_insert_work_sanity_check+0xd0/0xd0
> [ 8091.732525]  ret_from_fork_asm+0x11/0x20
> [ 8091.733212]  </TASK>
> [ 8091.733909] 
> [ 8091.734559] The buggy address belongs to the virtual mapping at
> [ 8091.734559]  [ffffc9000500b000, ffffc9000644d000) created by:
> [ 8091.734559]  ath12k_pci_coredump_download+0x147/0x1330 [ath12k]
> [ 8091.736558] 
> [ 8091.737272] The buggy address belongs to the physical page:
> [ 8091.738016] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x15a485
> [ 8091.738730] flags: 0x200000000000000(node=0|zone=2)
> [ 8091.739481] raw: 0200000000000000 0000000000000000 dead000000000122 0000000000000000
> [ 8091.740256] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
> [ 8091.741043] page dumped because: kasan: bad access detected
> [ 8091.741786] 
> [ 8091.742529] Memory state around the buggy address:
> [ 8091.743296]  ffffc9000644b180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [ 8091.744087]  ffffc9000644b200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [ 8091.744834] >ffffc9000644b280: 00 04 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
> [ 8091.745598]                       ^
> [ 8091.746359]  ffffc9000644b300: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
> [ 8091.747152]  ffffc9000644b380: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
> [ 8091.747932] ==================================================================
> [ 8091.748688] Disabling lock debugging due to kernel taint
> [ 8091.749699] ath12k_pci 0000:06:00.0: Uploading coredump
> 
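
For anyone trying to narrow this down, the numbers in the report above can be decoded by hand. The following is a minimal sketch, not part of the patch, that assumes generic KASAN on x86-64 (one shadow byte per 8 bytes of memory, shadow values 1..7 meaning "only the first N bytes are accessible"), 4 KiB pages, and a single vmalloc guard page. All function and variable names are hypothetical; only the addresses and shadow bytes are taken from the log.

```python
# Decode the KASAN vmalloc-out-of-bounds report quoted above.
# Assumptions (not from the patch): generic KASAN, 8-byte shadow granules,
# 4 KiB pages, one trailing vmalloc guard page.

PAGE_SIZE = 4096
KASAN_GRANULE = 8

def first_invalid_address(row_base, shadow_row):
    """Return the first inaccessible address described by one shadow row."""
    for i, shadow in enumerate(shadow_row):
        granule = row_base + i * KASAN_GRANULE
        if shadow == 0x00:
            continue                      # all 8 bytes accessible
        if 0 < shadow < KASAN_GRANULE:
            return granule + shadow       # only the first `shadow` bytes accessible
        return granule                    # poisoned granule (e.g. 0xf8)
    return row_base + len(shadow_row) * KASAN_GRANULE

def page_align_up(size):
    """Round a size up to the next page boundary."""
    return (size + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)

# Values copied from the report.
MAPPING_START = 0xffffc9000500b000
MAPPING_END = 0xffffc9000644d000
BAD_ADDR = 0xffffc9000644b28c
WRITE_SIZE = 4
ROW_BASE = 0xffffc9000644b280
SHADOW_ROW = [0x00, 0x04] + [0xf8] * 14

def analyse():
    valid_end = first_invalid_address(ROW_BASE, SHADOW_ROW)
    usable_size = valid_end - MAPPING_START
    return {
        "valid_end": valid_end,
        "usable_size": usable_size,
        # Offset of the faulting store relative to the end of the usable area.
        "bytes_past_end_start": BAD_ADDR - valid_end,
        "bytes_past_end_stop": BAD_ADDR + WRITE_SIZE - valid_end,
        # Sanity check: usable size rounded to pages plus one guard page
        # should equal the reported mapping length.
        "mapping_matches": page_align_up(usable_size) + PAGE_SIZE
                           == MAPPING_END - MAPPING_START,
    }

if __name__ == "__main__":
    for key, value in analyse().items():
        print(key, hex(value) if isinstance(value, int) and not isinstance(value, bool) else value)
```

Under those assumptions the usable buffer is 0x144028c bytes long and the 4-byte store starts exactly at that offset, so the write lands in the 4 bytes immediately after the end of the allocation. That would be consistent with the calculated dump size being slightly smaller than what the download path writes, but the log alone does not prove it.
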

Hi Kalle,

Could you please share the kernel config you used, so that I can reproduce the above KASAN issue on my local setup? I tried with the KASAN configs below enabled, but could not reproduce the warning on my x86 setup.

CONFIG_HAVE_ARCH_KASAN=y
CONFIG_HAVE_ARCH_KASAN_VMALLOC=y
CONFIG_CC_HAS_KASAN_GENERIC=y
CONFIG_KASAN=y
CONFIG_KASAN_GENERIC=y
CONFIG_KASAN_INLINE=y
CONFIG_KASAN_STACK=y
CONFIG_KASAN_VMALLOC=y
CONFIG_KASAN_EXTRA_INFO=y

Thanks,
Sowmiya Sree




