On Fri, Jun 03, 2022 at 02:57:47PM +0900, Jaewon Kim wrote:
> The atomic page allocation failure sometimes happened, and most of them
> seem to occur during boot time.
>
> <4>[ 59.707645] system_server: page allocation failure: order:0, mode:0xa20(GFP_ATOMIC), nodemask=(null),cpuset=foreground-boost,mems_allowed=0
> <4>[ 59.707676] CPU: 5 PID: 1209 Comm: system_server Tainted: G S O 5.4.161-qgki-24219806-abA236USQU0AVE1 #1
> <4>[ 59.707691] Call trace:
> <4>[ 59.707702] dump_backtrace.cfi_jt+0x0/0x4
> <4>[ 59.707712] show_stack+0x18/0x24
> <4>[ 59.707719] dump_stack+0xa4/0xe0
> <4>[ 59.707728] warn_alloc+0x114/0x194
> <4>[ 59.707734] __alloc_pages_slowpath+0x828/0x83c
> <4>[ 59.707740] __alloc_pages_nodemask+0x2b4/0x310
> <4>[ 59.707747] alloc_slab_page+0x40/0x5c8
> <4>[ 59.707753] new_slab+0x404/0x420
> <4>[ 59.707759] ___slab_alloc+0x224/0x3b0
> <4>[ 59.707765] __kmalloc+0x37c/0x394
> <4>[ 59.707773] context_struct_to_string+0x110/0x1b8
> <4>[ 59.707778] context_add_hash+0x6c/0xc8
> <4>[ 59.707785] security_compute_sid.llvm.13699573597798246927+0x508/0x5d8
> <4>[ 59.707792] security_transition_sid+0x2c/0x38
> <4>[ 59.707804] selinux_socket_create+0xa0/0xd8
> <4>[ 59.707811] security_socket_create+0x68/0xbc
> <4>[ 59.707818] __sock_create+0x8c/0x2f8
> <4>[ 59.707823] __sys_socket+0x94/0x19c
> <4>[ 59.707829] __arm64_sys_socket+0x20/0x30
> <4>[ 59.707836] el0_svc_common+0x100/0x1e0
> <4>[ 59.707841] el0_svc_handler+0x68/0x74
> <4>[ 59.707848] el0_svc+0x8/0xc
> <4>[ 59.707853] Mem-Info:
> <4>[ 59.707890] active_anon:223569 inactive_anon:74412 isolated_anon:0
> <4>[ 59.707890] active_file:51395 inactive_file:176622 isolated_file:0
> <4>[ 59.707890] unevictable:1018 dirty:211 writeback:4 unstable:0
> <4>[ 59.707890] slab_reclaimable:14398 slab_unreclaimable:61909
> <4>[ 59.707890] mapped:134779 shmem:1231 pagetables:26706 bounce:0
> <4>[ 59.707890] free:528 free_pcp:844 free_cma:147
> <4>[ 59.707900] Node 0 active_anon:894276kB inactive_anon:297648kB
active_file:205580kB inactive_file:706488kB unevictable:4072kB isolated(anon):0kB isolated(file):0kB mapped:539116kB dirty:844kB writeback:16kB shmem:4924kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
> <4>[ 59.707912] Normal free:2112kB min:7244kB low:68892kB high:72180kB active_anon:893140kB inactive_anon:297660kB active_file:204740kB inactive_file:706396kB unevictable:4072kB writepending:860kB present:3626812kB managed:3288700kB mlocked:4068kB kernel_stack:62416kB shadow_call_stack:15656kB pagetables:106824kB bounce:0kB free_pcp:3372kB local_pcp:176kB free_cma:588kB
> <4>[ 59.707915] lowmem_reserve[]: 0 0
> <4>[ 59.707922] Normal: 8*4kB (H) 5*8kB (H) 13*16kB (H) 25*32kB (H) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1080kB
> <4>[ 59.707942] 242549 total pagecache pages
> <4>[ 59.707951] 12446 pages in swap cache
> <4>[ 59.707956] Swap cache stats: add 212408, delete 199969, find 36869/71571
> <4>[ 59.707961] Free swap = 3445756kB
> <4>[ 59.707965] Total swap = 4194300kB
> <4>[ 59.707969] 906703 pages RAM
> <4>[ 59.707973] 0 pages HighMem/MovableOnly
> <4>[ 59.707978] 84528 pages reserved
> <4>[ 59.707982] 49152 pages cma reserved
>
> The kswapd or other reclaim contexts may not prepare enough free pages
> for too many atomic allocations occurred in short time. But zram may not
> be helpful for this atomic allocation even though zram is used to
> reclaim.
>
> To get one zs object for a specific size, zram may allocate several
> pages. And this can happen on different class sizes at the same
> time. It means zram may consume more pages to reclaim only one page.
> This inefficiency may consume all free pages below watermark min by a
> process having PF_MEMALLOC like kswapd.

However, that's how zram has worked for a long time (allocating memory
under memory pressure), and many folks have already raised
min_free_kbytes when they use zram as swap.
If we don't allow the allocation, swap-out will fail more easily than
before, which would break existing tunings.

>
> We can avoid this by adding __GFP_NOMEMALLOC. PF_MEMALLOC process won't
> use ALLOC_NO_WATERMARKS.
>
> Signed-off-by: Jaewon Kim <jaewon31.kim@xxxxxxxxxxx>
> ---
>  drivers/block/zram/zram_drv.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index b8549c61ff2c..39cd1397ed3b 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -1383,6 +1383,7 @@ static int __zram_bvec_write(struct zram *zram, struct bio_vec *bvec,
>
>  	handle = zs_malloc(zram->mem_pool, comp_len,
>  			__GFP_KSWAPD_RECLAIM |
> +			__GFP_NOMEMALLOC |
>  			__GFP_NOWARN |
>  			__GFP_HIGHMEM |
>  			__GFP_MOVABLE);
> --
> 2.17.1
>
>
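
For illustration only, the trade-off discussed above can be sketched as a
small user-space model. This is not kernel code: the MODEL_* bit values and
both function names are hypothetical, and the logic is a deliberate
simplification of the page allocator's checks (it ignores interrupt and
softirq context, OOM reserves, allocation order, lowmem_reserve and the
reduced watermarks that atomic allocations get). It only shows that a task
with PF_MEMALLOC set may dip below the min watermark unless the request
carries __GFP_NOMEMALLOC.

```c
#include <stdbool.h>

/* Hypothetical bit values for this model only; not the kernel's encodings. */
#define MODEL_GFP_NOMEMALLOC (1u << 0)  /* stands in for __GFP_NOMEMALLOC */
#define MODEL_GFP_MEMALLOC   (1u << 1)  /* stands in for __GFP_MEMALLOC */
#define MODEL_PF_MEMALLOC    (1u << 0)  /* stands in for PF_MEMALLOC */

/*
 * Simplified model of "may this request ignore the watermarks?".
 * NOMEMALLOC wins over everything else; otherwise either the request
 * flag or the task flag grants access to the reserves.
 */
static bool model_may_ignore_watermarks(unsigned int gfp_mask,
                                        unsigned int task_flags)
{
    if (gfp_mask & MODEL_GFP_NOMEMALLOC)
        return false;
    if (gfp_mask & MODEL_GFP_MEMALLOC)
        return true;
    return (task_flags & MODEL_PF_MEMALLOC) != 0;
}

/*
 * Simplified model of an order-0 allocation against the min watermark.
 * Without reserve access the request needs free memory above min;
 * with reserve access it only needs some free memory at all.
 */
static bool model_alloc_allowed(unsigned long free_kb, unsigned long min_kb,
                                unsigned int gfp_mask,
                                unsigned int task_flags)
{
    if (model_may_ignore_watermarks(gfp_mask, task_flags))
        return free_kb > 0;
    return free_kb > min_kb;
}
```

Plugging in the Normal zone numbers from the report (free:2112kB,
min:7244kB): in this model a PF_MEMALLOC task such as kswapd still gets its
page without the new flag, and is refused once __GFP_NOMEMALLOC is added.
That refusal is what protects the reserves, and it is also what makes the
zram write, and so the swap-out, fail where it used to succeed.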