On 2024/02/23 13:43, Sergey Senozhatsky wrote:
> On (24/02/23 11:10), Tetsuo Handa wrote:
>>
>> I can observe this bug during evict_folios() from 6.7.0 to 6.8.0-rc5-00163-gffd2cb6b718e.
>> Since I haven't observed with 6.6.0, this bug might be introduced in 6.7 cycle.
>
> Can we please run a bisect?

Bisection pointed at commit afb2d666d025 ("zsmalloc: use copy_page for full page copy"),
because copy_page() is implemented as non-instrumented code, which KMSAN cannot handle.
On x86_64, copy_page() is defined in arch/x86/lib/copy_page_64.S as below.

----------------------------------------
/*
 * Some CPUs run faster using the string copy instructions (sane microcode).
 * It is also a lot simpler. Use this when possible. But, don't use streaming
 * copy unless the CPU indicates X86_FEATURE_REP_GOOD. Could vary the
 * prefetch distance based on SMP/UP.
 */
	ALIGN
SYM_FUNC_START(copy_page)
	ALTERNATIVE "jmp copy_page_regs", "", X86_FEATURE_REP_GOOD
	movl	$4096/8, %ecx
	rep	movsq
	RET
SYM_FUNC_END(copy_page)
EXPORT_SYMBOL(copy_page)
----------------------------------------

To fix this problem, we need to implement copy_page() etc. in a way that KMSAN can handle.

Question to KASAN people: Is it possible to add annotations for KMSAN to assembly code?
Or do we need to disable the assembly version and force use of the C version when KMSAN
is enabled?

>
> There are some zsmalloc patches for 6.8 (mm-unstable), I don't recall
> anything in 6.7.
>
>> ----------------------------------------
>> [ 0.000000][ T0] Linux version 6.8.0-rc5-00163-gffd2cb6b718e (root@ubuntu) (Ubuntu clang version 14.0.0-1ubuntu1.1, Ubuntu LLD 14.0.0) #1094 SMP PREEMPT_DYNAMIC Fri Feb 23 01:45:21 UTC 2024
>> [ 50.026544][ T2974] =====================================================
>> [ 50.030627][ T2974] BUG: KMSAN: use-after-free in obj_malloc+0x6cc/0x7b0
>> [ 50.034611][ T2974] obj_malloc+0x6cc/0x7b0
>> obj_malloc at mm/zsmalloc.c:0
>> [ 50.037250][ T2974] zs_malloc+0xdbd/0x1400
>> zs_malloc at mm/zsmalloc.c:0
>> [ 50.039852][ T2974] zs_zpool_malloc+0xa5/0x1b0
>> zs_zpool_malloc at mm/zsmalloc.c:372
>> [ 50.044707][ T2974] zpool_malloc+0x110/0x150
>> zpool_malloc at mm/zpool.c:258
>> [ 50.049607][ T2974] zswap_store+0x2bbb/0x3d30
>> zswap_store at mm/zswap.c:1637
>> [ 50.054463][ T2974] swap_writepage+0x15b/0x4f0
>> swap_writepage at mm/page_io.c:198
>> [ 50.059392][ T2974] pageout+0x41d/0xef0
>> pageout at mm/vmscan.c:654
>> [ 50.064057][ T2974] shrink_folio_list+0x4d7a/0x7480
>> shrink_folio_list at mm/vmscan.c:1316
>> [ 50.069176][ T2974] evict_folios+0x30f1/0x5170
>> evict_folios at mm/vmscan.c:4521
>> [ 50.074082][ T2974] try_to_shrink_lruvec+0x983/0xd20
>> [ 50.079352][ T2974] shrink_one+0x72d/0xeb0
>> [ 50.084061][ T2974] shrink_many+0x70d/0x10b0
>> [ 50.088859][ T2974] lru_gen_shrink_node+0x577/0x850
>> [ 50.094192][ T2974] shrink_node+0x13d/0x1de0
>> [ 50.099028][ T2974] shrink_zones+0x878/0x14a0
>> [ 50.103958][ T2974] do_try_to_free_pages+0x2ac/0x16a0
>> [ 50.109138][ T2974] try_to_free_pages+0xd9e/0x1910
>> [ 50.114190][ T2974] __alloc_pages_slowpath+0x147a/0x2bd0
>> [ 50.119555][ T2974] __alloc_pages+0xb8c/0x1050
>> [ 50.124472][ T2974] alloc_pages_mpol+0x8e0/0xc80
>> [ 50.129367][ T2974] alloc_pages+0x224/0x240
>> [ 50.134022][ T2974] pipe_write+0xabe/0x2ba0
>> [ 50.138632][ T2974] vfs_write+0xfb0/0x1b80
>> [ 50.143171][ T2974] ksys_write+0x275/0x500
>> [ 50.147723][ T2974] __x64_sys_write+0xdf/0x120
>> [ 50.152431][ T2974] do_syscall_64+0xd1/0x1b0
>> [ 50.157106][ T2974] entry_SYSCALL_64_after_hwframe+0x63/0x6b
>> [ 50.162382][ T2974]
>> [ 50.165956][ T2974] Uninit was stored to memory at:
>> [ 50.170819][ T2974] obj_malloc+0x70a/0x7b0
>> set_freeobj at mm/zsmalloc.c:476
>> (inlined by) obj_malloc at mm/zsmalloc.c:1333
>> [ 50.175341][ T2974] zs_malloc+0xdbd/0x1400
>> zs_malloc at mm/zsmalloc.c:0
>> [ 50.179923][ T2974] zs_zpool_malloc+0xa5/0x1b0
>> zs_zpool_malloc at mm/zsmalloc.c:372
>> [ 50.184636][ T2974] zpool_malloc+0x110/0x150
>> zpool_malloc at mm/zpool.c:258
>> [ 50.189257][ T2974] zswap_store+0x2bbb/0x3d30
>> zswap_store at mm/zswap.c:1637
>> [ 50.193918][ T2974] swap_writepage+0x15b/0x4f0
>> swap_writepage at mm/page_io.c:198
>> [ 50.198615][ T2974] pageout+0x41d/0xef0
>> pageout at mm/vmscan.c:654
>> [ 50.203012][ T2974] shrink_folio_list+0x4d7a/0x7480
>> shrink_folio_list at mm/vmscan.c:1316
>> [ 50.207772][ T2974] evict_folios+0x30f1/0x5170
>> evict_folios at mm/vmscan.c:4521
>> [ 50.212321][ T2974] try_to_shrink_lruvec+0x983/0xd20
>> [ 50.217092][ T2974] shrink_one+0x72d/0xeb0
>> [ 50.221441][ T2974] shrink_many+0x70d/0x10b0
>> [ 50.225891][ T2974] lru_gen_shrink_node+0x577/0x850
>> [ 50.230614][ T2974] shrink_node+0x13d/0x1de0
>> [ 50.235128][ T2974] shrink_zones+0x878/0x14a0
>> [ 50.239646][ T2974] do_try_to_free_pages+0x2ac/0x16a0
>> [ 50.244461][ T2974] try_to_free_pages+0xd9e/0x1910
>> [ 50.249151][ T2974] __alloc_pages_slowpath+0x147a/0x2bd0
>> [ 50.254148][ T2974] __alloc_pages+0xb8c/0x1050
>> [ 50.258679][ T2974] alloc_pages_mpol+0x8e0/0xc80
>> [ 50.263289][ T2974] alloc_pages+0x224/0x240
>> [ 50.267767][ T2974] pipe_write+0xabe/0x2ba0
>> [ 50.272190][ T2974] vfs_write+0xfb0/0x1b80
>> [ 50.276543][ T2974] ksys_write+0x275/0x500
>> [ 50.280931][ T2974] __x64_sys_write+0xdf/0x120
>> [ 50.289451][ T2974] do_syscall_64+0xd1/0x1b0
>> [ 50.303402][ T2974] entry_SYSCALL_64_after_hwframe+0x63/0x6b
>> [ 50.318721][ T2974]
>> [ 50.328931][ T2974] Uninit was created at:
>> [ 50.341845][ T2974] free_unref_page_prepare+0x130/0xfc0
>> arch_static_branch_jump at arch/x86/include/asm/jump_label.h:55
>> (inlined by) memcg_kmem_online at include/linux/memcontrol.h:1840
>> (inlined by) free_pages_prepare at mm/page_alloc.c:1096
>> (inlined by) free_unref_page_prepare at mm/page_alloc.c:2346
>> [ 50.356492][ T2974] free_unref_page_list+0x139/0x1050
>> free_unref_page_list at mm/page_alloc.c:2532
>> [ 50.370898][ T2974] shrink_folio_list+0x7139/0x7480
>> list_empty at include/linux/list.h:373
>> (inlined by) list_splice at include/linux/list.h:545
>> (inlined by) shrink_folio_list at mm/vmscan.c:1490
>> [ 50.385025][ T2974] evict_folios+0x30f1/0x5170
>> evict_folios at mm/vmscan.c:4521
>> [ 50.398448][ T2974] try_to_shrink_lruvec+0x983/0xd20
>> [ 50.412660][ T2974] shrink_one+0x72d/0xeb0
>> [ 50.425591][ T2974] shrink_many+0x70d/0x10b0
>> [ 50.438827][ T2974] lru_gen_shrink_node+0x577/0x850
>> [ 50.454390][ T2974] shrink_node+0x13d/0x1de0
>> [ 50.479401][ T2974] shrink_zones+0x878/0x14a0
>> [ 50.529610][ T2974] do_try_to_free_pages+0x2ac/0x16a0
>> [ 50.544397][ T2974] try_to_free_pages+0xd9e/0x1910
>> [ 50.559556][ T2974] __alloc_pages_slowpath+0x147a/0x2bd0
>> [ 50.574932][ T2974] __alloc_pages+0xb8c/0x1050
>> [ 50.589024][ T2974] alloc_pages_mpol+0x8e0/0xc80
>> [ 50.603421][ T2974] alloc_pages+0x224/0x240
>> [ 50.616483][ T2974] pipe_write+0xabe/0x2ba0
>> [ 50.629601][ T2974] vfs_write+0xfb0/0x1b80
>> [ 50.643009][ T2974] ksys_write+0x275/0x500
>> [ 50.656157][ T2974] __x64_sys_write+0xdf/0x120
>> [ 50.670080][ T2974] do_syscall_64+0xd1/0x1b0
>> [ 50.683405][ T2974] entry_SYSCALL_64_after_hwframe+0x63/0x6b
>> [ 50.698626][ T2974]
>> ----------------------------------------
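
To illustrate why a non-instrumented copy_page() can confuse KMSAN, here is a
toy userspace model. It is only a sketch, not kernel code and not the real
KMSAN implementation. It assumes one shadow byte per data byte (1 means
"uninitialized") and that a freed page is marked uninitialized, which matches
the "Uninit was created at: free_unref_page_prepare" part of the report above.
All names (struct model_page, model_copy_page_raw() and so on) are
hypothetical and exist only in this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096

/* Toy page: payload plus one shadow byte per data byte (1 = uninitialized). */
struct model_page {
	unsigned char data[MODEL_PAGE_SIZE];
	unsigned char shadow[MODEL_PAGE_SIZE];
};

/* Mark the whole page uninitialized, as assumed to happen when it is freed. */
static void model_poison(struct model_page *p)
{
	memset(p->shadow, 1, MODEL_PAGE_SIZE);
}

/* Mark the whole page initialized. */
static void model_unpoison(struct model_page *p)
{
	memset(p->shadow, 0, MODEL_PAGE_SIZE);
}

/*
 * Stand-in for the assembly copy_page(): the data is copied, but the
 * checker never sees the store, so the destination shadow stays stale.
 */
static void model_copy_page_raw(struct model_page *to,
				const struct model_page *from)
{
	memcpy(to->data, from->data, MODEL_PAGE_SIZE);
}

/* Stand-in for an instrumented C copy: data and shadow move together. */
static void model_copy_page_tracked(struct model_page *to,
				    const struct model_page *from)
{
	memcpy(to->data, from->data, MODEL_PAGE_SIZE);
	memcpy(to->shadow, from->shadow, MODEL_PAGE_SIZE);
}

/* Number of bytes the checker would still flag as uninitialized. */
static size_t model_count_uninit(const struct model_page *p)
{
	size_t i, n = 0;

	for (i = 0; i < MODEL_PAGE_SIZE; i++)
		n += p->shadow[i] != 0;
	return n;
}
```

In this model, copying an initialized page into a previously freed (poisoned)
page with model_copy_page_raw() leaves every destination byte flagged even
though the data is valid, while model_copy_page_tracked() clears the flags.
That is the kind of stale state I suspect is behind the report, and it is why
forcing a C version of copy_page() under KMSAN looks like one possible
direction.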