We encountered a deadlock when using zram on kernel 5.10. The cause is that backing_dev_store first takes zram->init_lock and then allocates memory with kvmalloc_node. At that point the system may be short on memory, so the allocation enters direct reclaim and starts swapping pages out. In zram_bvec_write we wake our own thread to move data from zram to the backing device to free up more memory; this is the work done in try_wakeup_zram_wbd. That function acquires zram->init_lock again to read zram->bd_wb_limit, which deadlocks, as shown in the call trace below. The latest mainline kernel does not have this bug, but it carries a potential risk for the future: if zram->init_lock is ever acquired again on the zram I/O path (for example, to read zram->bd_wb_limit), the same deadlock can occur.

The bug is quite elusive; it only appeared with low probability after a week-long stress test on 50 devices. Therefore, I suggest passing the GFP_ATOMIC flag for this allocation in backing_dev_store to avoid the issue we encountered. Please consider it, thank you. (A minimal sketch of the lock pattern is appended after the patch.)

INFO: task init:331 blocked for more than 120 seconds.
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:init            state:D stack:    0 pid:    1 ppid:     0 flags:0x04000000
Call trace:
 __switch_to+0x244/0x4e4
 __schedule+0x5bc/0xc48
 schedule+0x80/0x164
 rwsem_down_read_slowpath+0x4fc/0xf9c
 __down_read+0x140/0x188
 down_read+0x14/0x24
 try_wakeup_wbd_thread+0x78/0x1ec [zram]
 __zram_bvec_write+0x720/0x878 [zram]
 zram_bvec_rw+0xa8/0x234 [zram]
 zram_submit_bio+0x16c/0x268 [zram]
 submit_bio_noacct+0x128/0x3c8
 submit_bio+0x1cc/0x3d0
 __swap_writepage+0x5c4/0xd4c
 swap_writepage+0x130/0x158
 pageout+0x1f4/0x478
 shrink_page_list+0x9b4/0x1eb8
 shrink_inactive_list+0x2f4/0xaa8
 shrink_lruvec+0x184/0x340
 shrink_node_memcgs+0x84/0x3a0
 shrink_node+0x2c4/0x6c4
 shrink_zones+0x16c/0x29c
 do_try_to_free_pages+0xe4/0x2b4
 try_to_free_pages+0x388/0x7b4
 __alloc_pages_direct_reclaim+0x88/0x278
 __alloc_pages_slowpath+0x4ec/0xf6c
 __alloc_pages_nodemask+0x1f4/0x3dc
 kmalloc_order+0x54/0x338
 kmalloc_order_trace+0x34/0x1bc
 __kmalloc+0x5e8/0x9c0
 kvmalloc_node+0xa8/0x264
 backing_dev_store+0x1a4/0x818 [zram]
 dev_attr_store+0x38/0x8c
 sysfs_kf_write+0x64/0xc4
 kernfs_fop_write_iter+0x168/0x2ac
 vfs_write+0x300/0x37c
 ksys_write+0x7c/0xf0
 __arm64_sys_write+0x20/0x30
 el0_svc_common+0xd4/0x270
 el0_svc+0x28/0x98
 el0_sync_handler+0x8c/0xf0
 el0_sync+0x1b4/0x1c0

Signed-off-by: Dongyun Liu <dongyun.liu@xxxxxxxxxxxxx>
---
 drivers/block/zram/zram_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index d77d3664ca08..ee6c22c50e09 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -514,7 +514,7 @@ static ssize_t backing_dev_store(struct device *dev,
 
 	nr_pages = i_size_read(inode) >> PAGE_SHIFT;
 	bitmap_sz = BITS_TO_LONGS(nr_pages) * sizeof(long);
-	bitmap = kvzalloc(bitmap_sz, GFP_KERNEL);
+	bitmap = kmalloc(bitmap_sz, GFP_ATOMIC);
 	if (!bitmap) {
 		err = -ENOMEM;
 		goto out;
-- 
2.25.1
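
For reviewers who do not have our vendor writeback code at hand, here is a minimal sketch of the lock inversion described above. It is not the mainline zram code: struct zram_sketch, store_path_sketch() and writeback_path_sketch() are illustrative placeholders; only the init_lock / GFP_KERNEL pattern matters.

#include <linux/rwsem.h>
#include <linux/mm.h>
#include <linux/slab.h>
#include <linux/types.h>

struct zram_sketch {
	struct rw_semaphore init_lock;	/* stands in for zram->init_lock */
	u64 bd_wb_limit;
};

/* sysfs store path: holds init_lock for write across the allocation. */
static int store_path_sketch(struct zram_sketch *z, size_t bitmap_sz)
{
	unsigned long *bitmap;

	down_write(&z->init_lock);
	/*
	 * GFP_KERNEL allows direct reclaim.  If reclaim swaps pages out
	 * through this same zram device, writeback_path_sketch() below is
	 * reached from this very task, which already holds init_lock for
	 * write, so the down_read() there can never succeed.
	 */
	bitmap = kvzalloc(bitmap_sz, GFP_KERNEL);
	if (!bitmap) {
		up_write(&z->init_lock);
		return -ENOMEM;
	}
	kvfree(bitmap);
	up_write(&z->init_lock);
	return 0;
}

/*
 * Stand-in for the vendor wakeup hook called from the zram write path
 * under reclaim: it takes init_lock for read to look at bd_wb_limit.
 */
static u64 writeback_path_sketch(struct zram_sketch *z)
{
	u64 limit;

	down_read(&z->init_lock);
	limit = z->bd_wb_limit;
	up_read(&z->init_lock);
	return limit;
}

With GFP_ATOMIC (or any flag without __GFP_DIRECT_RECLAIM) in the store path, the allocation cannot recurse into reclaim, so the read side is never entered while the same task holds init_lock for write.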