kernel bug found and root cause analysis

Hello, I found a bug titled "KASAN: use-after-free Read in z3fold_zpool_free" with a modified syzkaller on the latest upstream kernel; it is related to Z3FOLD.
If you fix this issue, please add the following tags to the commit:
Reported-by: Jianzhou Zhao <xnxc22xnxc22@xxxxxx>
Reported-by: xingwei lee <xrivendell7@xxxxxxxxx>
Reported-by: Zhizhuo Tang <strforexctzzchange@xxxxxxxxxxx>

------------[ cut here ]-----------------------------------------
 KASAN: use-after-free Read in z3fold_zpool_free 
==================================================================
==================================================================
BUG: KASAN: use-after-free in lock_release+0x66b/0x6f0 kernel/locking/lockdep.c:5864
Read of size 8 at addr ffff888000917028 by task syz-executor/12358

CPU: 0 UID: 0 PID: 12358 Comm: syz-executor Not tainted 6.14.0-rc5-dirty #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Call Trace:
 <task>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x116/0x1b0 lib/dump_stack.c:120
 print_address_description mm/kasan/report.c:408 [inline]
 print_report+0xc1/0x630 mm/kasan/report.c:521
 kasan_report+0xbd/0xf0 mm/kasan/report.c:634
 lock_release+0x66b/0x6f0 kernel/locking/lockdep.c:5864
 __raw_spin_unlock include/linux/spinlock_api_smp.h:141 [inline]
 _raw_spin_unlock+0x16/0x50 kernel/locking/spinlock.c:186
 spin_unlock include/linux/spinlock.h:391 [inline]
 z3fold_page_unlock mm/z3fold.c:235 [inline]
 get_z3fold_header mm/z3fold.c:260 [inline]
 get_z3fold_header mm/z3fold.c:239 [inline]
 z3fold_free mm/z3fold.c:1100 [inline]
 z3fold_zpool_free+0x6f/0xe30 mm/z3fold.c:1392
 zswap_entry_free+0x234/0x540 mm/zswap.c:806
 zswap_invalidate+0x122/0x190 mm/zswap.c:1682
 swap_range_free mm/swapfile.c:1133 [inline]
 swap_entry_range_free+0x2dc/0x800 mm/swapfile.c:1512
 __swap_entries_free mm/swapfile.c:1470 [inline]
 free_swap_and_cache_nr+0x82c/0x910 mm/swapfile.c:1797
 zap_nonpresent_ptes mm/memory.c:1636 [inline]
 do_zap_pte_range mm/memory.c:1702 [inline]
 zap_pte_range mm/memory.c:1742 [inline]
 zap_pmd_range mm/memory.c:1834 [inline]
 zap_pud_range mm/memory.c:1863 [inline]
 zap_p4d_range mm/memory.c:1884 [inline]
 unmap_page_range+0x13f4/0x4270 mm/memory.c:1905
 unmap_single_vma+0x19a/0x2b0 mm/memory.c:1951
 unmap_vmas+0x1f2/0x440 mm/memory.c:1995
 exit_mmap+0x1b4/0xbc0 mm/mmap.c:1284
 __mmput+0x128/0x400 kernel/fork.c:1356
 mmput+0x60/0x70 kernel/fork.c:1378
 exit_mm kernel/exit.c:570 [inline]
 do_exit+0x9ae/0x2d00 kernel/exit.c:925
 do_group_exit+0xd3/0x2a0 kernel/exit.c:1087
 get_signal+0x2278/0x2540 kernel/signal.c:3036
 arch_do_signal_or_restart+0x81/0x7d0 arch/x86/kernel/signal.c:337
 exit_to_user_mode_loop kernel/entry/common.c:111 [inline]
 exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline]
 __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
 syscall_exit_to_user_mode+0x150/0x2a0 kernel/entry/common.c:218
 do_syscall_64+0xd8/0x250 arch/x86/entry/common.c:89
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f58f1babbb6
Code: Unable to access opcode bytes at 0x7f58f1babb8c.
RSP: 002b:00007ffc0348a6c0 EFLAGS: 00000293 ORIG_RAX: 000000000000002c
RAX: 000000000000002c RBX: 00007f58f28f4620 RCX: 00007f58f1babbb6
RDX: 000000000000002c RSI: 00007f58f28f4670 RDI: 0000000000000003
RBP: 0000000000000001 R08: 00007ffc0348a71c R09: 000000000000000c
R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
R13: 00007f58f28f4670 R14: 0000000000000003 R15: 0000000000000000
 </task>

The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x5e1 pfn:0x917
flags: 0x7ff00000000000(node=0|zone=0|lastcpupid=0x7ff)
raw: 007ff00000000000 dead000000000100 dead000000000122 0000000000000000
raw: 00000000000005e1 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as freed
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x12800(GFP_NOWAIT|__GFP_NORETRY), pid 13319, tgid 13319 (syz-executor), ts 479686138015, free_ts 507033168720
 set_page_owner include/linux/page_owner.h:32 [inline]
 post_alloc_hook+0x193/0x1c0 mm/page_alloc.c:1551
 prep_new_page mm/page_alloc.c:1559 [inline]
 get_page_from_freelist+0xe4e/0x2b20 mm/page_alloc.c:3477
 __alloc_pages_slowpath mm/page_alloc.c:4288 [inline]
 __alloc_frozen_pages_noprof+0x6ce/0x21f0 mm/page_alloc.c:4752
 alloc_pages_mpol+0x1f2/0x540 mm/mempolicy.c:2270
 alloc_frozen_pages_noprof mm/mempolicy.c:2341 [inline]
 alloc_pages_noprof+0x12d/0x390 mm/mempolicy.c:2361
 z3fold_alloc mm/z3fold.c:1036 [inline]
 z3fold_zpool_malloc+0x836/0x1500 mm/z3fold.c:1388
 zswap_compress mm/zswap.c:971 [inline]
 zswap_store_page mm/zswap.c:1462 [inline]
 zswap_store+0xb27/0x25c0 mm/zswap.c:1571
 swap_writepage+0x3a8/0xe50 mm/page_io.c:278
 pageout+0x3b6/0xaa0 mm/vmscan.c:696
 shrink_folio_list+0x272c/0x4110 mm/vmscan.c:1402
 evict_folios+0x7c6/0x1aa0 mm/vmscan.c:4660
 try_to_shrink_lruvec+0x59a/0x9c0 mm/vmscan.c:4821
 shrink_one+0x417/0x7c0 mm/vmscan.c:4866
 shrink_many mm/vmscan.c:4929 [inline]
 lru_gen_shrink_node mm/vmscan.c:5007 [inline]
 shrink_node+0x2698/0x3d60 mm/vmscan.c:5978
 shrink_zones mm/vmscan.c:6237 [inline]
 do_try_to_free_pages+0x372/0x1990 mm/vmscan.c:6299
 try_to_free_pages+0x2a4/0x6b0 mm/vmscan.c:6549
page last free pid 38 tgid 38 stack trace:
 reset_page_owner include/linux/page_owner.h:25 [inline]
 free_pages_prepare mm/page_alloc.c:1127 [inline]
 free_frozen_pages+0x718/0xfd0 mm/page_alloc.c:2660
 __folio_put+0x321/0x440 mm/swap.c:112
 folio_put include/linux/mm.h:1489 [inline]
 migrate_folio_done+0x29d/0x340 mm/migrate.c:1180
 migrate_folio_move mm/migrate.c:1402 [inline]
 migrate_folios_move mm/migrate.c:1712 [inline]
 migrate_pages_batch+0x1c95/0x31e0 mm/migrate.c:1959
 migrate_pages_sync+0x10c/0x890 mm/migrate.c:1989
 migrate_pages+0x19fd/0x21c0 mm/migrate.c:2098
 compact_zone+0x1b3d/0x3e70 mm/compaction.c:2663
 compact_node+0x17f/0x2c0 mm/compaction.c:2932
 kcompactd+0x8b9/0xc80 mm/compaction.c:3226
 kthread+0x3b0/0x760 kernel/kthread.c:464
 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:148
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

Memory state around the buggy address:
 ffff888000916f00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff888000916f80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>ffff888000917000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                                  ^
 ffff888000917080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff888000917100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
==================================================================
I used the same kernel as the upstream syzbot instance: 7eb172143d5508b4da468ed59ee857c6e5e01da6
kernel config: https://syzkaller.appspot.com/text?tag=KernelConfig&x=da4b04ae798b7ef6
compiler: gcc version 11.4.0
===============================================================================
Unfortunately, the modified syzkaller did not generate an effective reproducer program.
The following is my analysis of the bug and a repair suggestion, which I hope helps with fixing it:
Call chain positioning:
The trigger path is z3fold_zpool_free() -> z3fold_free() -> get_z3fold_header() -> z3fold_page_unlock().
When spin_unlock() runs inside z3fold_page_unlock(), it reads the lock structure stored in an already-freed page.
Memory status verification:
The refcount and mapcount of the physical page are both 0, so the page has already been freed.
page_owner shows the page was last freed via migrate_folio_done(), yet the z3fold code still accesses it.
Race condition analysis:
The bug can occur when memory compaction (page migration) races with the z3fold free path.
While the page is freed by the migration path, z3fold still holds a stale reference to the page's z3fold_header.
Lock lifetime issue:
The spinlock_t in struct z3fold_header is stored inside the freed page itself.
After the page is freed, the stale pointer is not invalidated, so the subsequent unlock accesses invalid memory.

Possible solution: verify that the page is still live before unlocking. Note that PageZ3fold() below is a hypothetical per-page flag that would need to be introduced; the page is derived from the header with virt_to_page():

static inline void z3fold_page_unlock(struct z3fold_header *zhdr)
{
+	struct page *page = virt_to_page(zhdr);
+
+	if (unlikely(!PageZ3fold(page)))
+		return;	/* the page has already been freed */
	spin_unlock(&zhdr->page_lock);
}
=========================================================================
I hope it helps.
Best regards
Jianzhou Zhao
xingwei lee
Zhizhuo Tang