On (24/02/27 03:02), Chengming Zhou wrote:
> free_zspage() has to hold locks of all pages, since zs_page_migrate()
> path rely on this page lock to protect the race between zs_free() and
> it, so it can safely get zspage from page->private.
>
> But this way is not good and simple enough:
>
> 1. Since zs_free() couldn't be sleepable, it can only trylock pages,
>    or has to kick_deferred_free() to defer that to a work.
>
> 2. Even in the worker context, async_free_zspage() can't simply
>    lock all pages in lock_zspage(), it's still trylock because of
>    the race between zs_free() and zs_page_migrate(). Please see
>    the commit 2505a981114d ("zsmalloc: fix races between asynchronous
>    zspage free and page migration") for details.
>
> Actually, all free_zspage() needs is to get zspage from page safely,
> we can use RCU to achieve it easily. Then free_zspage() don't need to
> hold locks of all pages, so don't need the deferred free mechanism
> at all. This patchset implements it and remove all of deferred free
> related code.
>
> Thanks for review and comments!
>
> Signed-off-by: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
> ---
> Chengming Zhou (2):
>       mm/zsmalloc: don't hold locks of all pages when free_zspage()

That seems to be crashing on me:

[ 28.123867] ==================================================================
[ 28.125303] BUG: KASAN: null-ptr-deref in obj_malloc+0xa9/0x1f0
[ 28.126289] Read of size 8 at addr 0000000000000028 by task mkfs.ext2/432
[ 28.127414]
[ 28.127684] CPU: 8 PID: 432 Comm: mkfs.ext2 Tainted: G N 6.8.0-rc5+ #309
[ 28.129015] Call Trace:
[ 28.129442] <TASK>
[ 28.129805] dump_stack_lvl+0x6f/0xab
[ 28.130437] print_report+0xe0/0x5e0
[ 28.131050] ? _printk+0x59/0x7b
[ 28.131602] ? kasan_report+0x96/0x120
[ 28.132233] ? obj_malloc+0xa9/0x1f0
[ 28.132837] kasan_report+0xe7/0x120
[ 28.133441] ? obj_malloc+0xa9/0x1f0
[ 28.134046] obj_malloc+0xa9/0x1f0
[ 28.134633] zs_malloc+0x22c/0x3e0
[ 28.135211] zram_submit_bio+0x44e/0xee0
[ 28.135871] ? lock_release+0x50c/0x700
[ 28.136520] submit_bio_noacct_nocheck+0x22a/0x650
[ 28.137327] __block_write_full_folio+0x48b/0x710
[ 28.138119] ? __cfi_blkdev_get_block+0x10/0x10
[ 28.138885] ? __cfi_block_write_full_folio+0x10/0x10
[ 28.139737] write_cache_pages+0x83/0xf0
[ 28.140397] ? __cfi_blkdev_get_block+0x10/0x10
[ 28.141152] blkdev_writepages+0x46/0x80
[ 28.141810] do_writepages+0x1be/0x400
[ 28.142443] file_write_and_wait_range+0x104/0x170
[ 28.143254] blkdev_fsync+0x4a/0x70
[ 28.143846] __x64_sys_fsync+0xe9/0x120
[ 28.144491] do_syscall_64+0x8d/0x130
[ 28.145106] entry_SYSCALL_64_after_hwframe+0x46/0x4e