Greetings,

To reproduce the bug, just run:
# rmmod mt7921e

Backtrace:

BUG: KASAN: use-after-free in tasklet_action_common.isra.0+0x6a4/0x7a0
Read of size 8 at addr ffff888146806748 by task ksoftirqd/5/48

CPU: 5 PID: 48 Comm: ksoftirqd/5 Tainted: G W L ------- --- 6.8.0-0.rc0.20240109git9f8413c4a66f.1.fc40.x86_64+debug #1
Hardware name: Micro-Star International Co., Ltd. MS-7D73/MPG B650I EDGE WIFI (MS-7D73), BIOS 1.81 01/05/2024
Call Trace:
 <TASK>
 dump_stack_lvl+0x76/0xd0
 print_report+0xcf/0x670
 ? tasklet_action_common.isra.0+0x6a4/0x7a0
 kasan_report+0xa6/0xe0
 ? tasklet_action_common.isra.0+0x6a4/0x7a0
 tasklet_action_common.isra.0+0x6a4/0x7a0
 __do_softirq+0x215/0x8b9
 ? __pfx___do_softirq+0x10/0x10
 ? run_ksoftirqd+0x73/0x80
 ? __pfx_run_ksoftirqd+0x10/0x10
 run_ksoftirqd+0x4b/0x80
 smpboot_thread_fn+0x56d/0x900
 ? __kthread_parkme+0xbd/0x1f0
 ? __pfx_smpboot_thread_fn+0x10/0x10
 kthread+0x2f2/0x3d0
 ? _raw_spin_unlock_irq+0x28/0x60
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x31/0x70
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1b/0x30
 </TASK>

The buggy address belongs to the physical page:
page:0000000021f6fa86 refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x146806
flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
page_type: 0xffffffff()
raw: 0017ffffc0000000 0000000000000000 dead000000000122 0000000000000000
raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888146806600: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff888146806680: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>ffff888146806700: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                                              ^
 ffff888146806780: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff888146806800: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

Demonstration: https://youtu.be/4dSuQp0aPkQ

I probably would not have paid attention to this, because in practice I
never need to unload the mt7921e module. But since commit
9270270d62191b7549296721e8d5f3dc0df01563 I see a use-after-free on every
system shutdown and reboot.

mikhail@secondary-ws ~/p/g/linux ((fcc51acf)|BISECTING)> git bisect good
9270270d62191b7549296721e8d5f3dc0df01563 is the first bad commit
commit 9270270d62191b7549296721e8d5f3dc0df01563
Author: Deren Wu <deren.wu@xxxxxxxxxxxx>
Date:   Tue Feb 14 10:49:57 2023 +0800

    wifi: mt76: mt7921: fix PCI DMA hang after reboot

    mt7921 just stop some workers and clean up chip status before reboot.
    In stress test, there are working activities still running at the period
    of .shutdown callback and that would cause some hosts cannot recover
    DMA after reboot. To avoid the floating state in reboot, we use
    mt7921_pci_remove() to fully deinit all resources.

    Fixes: f23a0cea8bd6 ("wifi: mt76: mt7921e: add pci .shutdown() support")
    Signed-off-by: Deren Wu <deren.wu@xxxxxxxxxxxx>
    Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@xxxxxxxxxxxxx>
    Signed-off-by: Felix Fietkau <nbd@xxxxxxxx>

 drivers/net/wireless/mediatek/mt76/mt7921/pci.c | 12 +-----------
 1 file changed, 1 insertion(+), 11 deletions(-)

The oldest kernel I could build is 5.17, and on that kernel the
use-after-free has a different backtrace:

BUG: KASAN: use-after-free in mt7921_irq_handler+0xd8/0x100 [mt7921e]
Read of size 8 at addr ffff88824a7d3b78 by task rmmod/11115

CPU: 28 PID: 11115 Comm: rmmod Tainted: G W L 5.17.0 #10
Hardware name: Micro-Star International Co., Ltd. MS-7D73/MPG B650I EDGE WIFI (MS-7D73), BIOS 1.81 01/05/2024
Call Trace:
 <TASK>
 dump_stack_lvl+0x6f/0xa0
 print_address_description.constprop.0+0x1f/0x190
 ? mt7921_irq_handler+0xd8/0x100 [mt7921e]
 ? mt7921_irq_handler+0xd8/0x100 [mt7921e]
 kasan_report.cold+0x7f/0x11b
 ? mt7921_irq_handler+0xd8/0x100 [mt7921e]
 mt7921_irq_handler+0xd8/0x100 [mt7921e]
 free_irq+0x627/0xaa0
 devm_free_irq+0x94/0xd0
 ? devm_request_any_context_irq+0x160/0x160
 ? kobject_put+0x18d/0x4a0
 mt7921_pci_remove+0x153/0x190 [mt7921e]
 pci_device_remove+0xa2/0x1d0
 __device_release_driver+0x346/0x6e0
 driver_detach+0x1ef/0x2c0
 bus_remove_driver+0xe7/0x2d0
 ? __check_object_size+0x57/0x310
 pci_unregister_driver+0x26/0x250
 __do_sys_delete_module+0x307/0x510
 ? free_module+0x6a0/0x6a0
 ? fpregs_assert_state_consistent+0x4b/0xb0
 ? rcu_read_lock_sched_held+0x10/0x70
 ? syscall_enter_from_user_mode+0x20/0x70
 ? trace_hardirqs_on+0x1c/0x130
 do_syscall_64+0x5c/0x80
 ? trace_hardirqs_on_prepare+0x72/0x160
 ? do_syscall_64+0x68/0x80
 ? trace_hardirqs_on_prepare+0x72/0x160
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fc83aad105b
Code: 73 01 c3 48 8b 0d bd 8d 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8d 8d 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffc384c28c8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000560eec64a750 RCX: 00007fc83aad105b
RDX: 0000000000000000 RSI: 0000000000000800 RDI: 0000560eec64a7b8
RBP: 00007ffc384c28f0 R08: 1999999999999999 R09: 0000000000000000
R10: 00007fc83ab49ac0 R11: 0000000000000206 R12: 0000000000000000
R13: 00007ffc384c2b60 R14: 0000560eec64a750 R15: 0000000000000000
 </TASK>

The buggy address belongs to the page:
page:00000000f94118a1 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x24a7d3
flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
raw: 0017ffffc0000000 0000000000000000 ffffea000929f488 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88824a7d3a00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff88824a7d3a80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>ffff88824a7d3b00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                                                                ^
 ffff88824a7d3b80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff88824a7d3c00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

All kernel logs and the .config are attached to this message.
What do you think?

-- 
Best Regards,
Mike Gavrilov.

Attachment:
dmesg-6.8.zip
Description: Zip archive
Attachment:
dmesg-5.17.0.zip
Description: Zip archive
Attachment:
.config.zip
Description: Zip archive
Attachment:
build-error-5.16.zip
Description: Zip archive