>> BUG: sleeping function called from invalid context at mm/page_alloc.c:5179
>> in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 1, name: swapper/0
>>
>> __dump_stack lib/dump_stack.c:79 [inline]
>> dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:96
>> ___might_sleep.cold+0x1f1/0x237 kernel/sched/core.c:9153
>> prepare_alloc_pages+0x3da/0x580 mm/page_alloc.c:5179
>> __alloc_pages+0x12f/0x500 mm/page_alloc.c:5375
>> alloc_pages+0x18c/0x2a0 mm/mempolicy.c:2272
>> stack_depot_save+0x39d/0x4e0 lib/stackdepot.c:303
>> save_stack+0x15e/0x1e0 mm/page_owner.c:120
>> __set_page_owner+0x50/0x290 mm/page_owner.c:181
>> prep_new_page mm/page_alloc.c:2445 [inline]
>> __alloc_pages_bulk+0x8b9/0x1870 mm/page_alloc.c:5313
>>
>> The problem is caused by set_page_owner alloc memory to save stack with
>> GFP_KERNEL in local_irq disabled.
>> So, we just can't assume that alloc flags should be same with new page,
>> prep_new_page should prep/trace the page gfp, but shouldn't use the same
>> gfp to get memory, let's depend on caller.
>> So, here is two gfp flags, alloc_gfp used to alloc memory, depend on
>> caller, page_gfp_mask is page's gfp, used to trace/prep itself
>> But in most situation, same is ok, in alloc_pages_bulk, use GFP_ATOMIC
>> is ok.(even if set_page_owner save backtrace failed, limited impact)
>>
>> v2:
>> - add more description.
>>
>> Fixes: 0f87d9d30f21 ("mm/page_alloc: add an array-based interface to the bulk page allocator")
>> Reported-by: syzbot+b07d8440edb5f8988eea@xxxxxxxxxxxxxxxxxxxxxxxxx
>> Suggested-by: Wang Qing <wangqing@xxxxxxxx>
>> Signed-off-by: Yang Huan <link@xxxxxxxx>
>
>https://lore.kernel.org/lkml/20210713152100.10381-2-mgorman@xxxxxxxxxxxxxxxxxxx/
>is now part of a series that has been sent to Linus. Hence, the Fixes
>part is no longer applicable and the patch will no longer be addressing
>an atomic sleep bug. This patch should be treated as an enhancement

Hi Mel Gorman, thanks for your reply.
I have seen the fix patch: it fixes this bug by abandoning the alloc bulk
feature when page_owner is set. In my opinion that does not really fix
the bug; it is a workaround.

>to allow bulk allocations when PAGE_OWNER is set. For that, it should
>include a note on the performance if PAGE_OWNER is used with either NFS
>or high-speed networking to justify the additional complexity.

My patch just splits the gfp passed to prep_new_page into alloc_gfp
(GFP_ATOMIC for alloc bulk, unchanged for other callers) and the page's
own gfp, which is used for tracing. That way we no longer use the wrong
flags to get memory.

So I think this will not affect alloc bulk performance when page_owner
is on (compared with the original patch), and it really fixes the bug
rather than evading it. It also lets the alloc bulk feature and the
page_owner feature work together.

So, I will send the patch again, based on the fix patch.

Thank you
Yang Huan

>
>--
>Mel Gorman
>SUSE Labs
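
For illustration only, the alloc_gfp / page gfp split described in the
reply can be modelled in plain userspace C. This is a rough sketch, not
the patch and not kernel code: every sketch_* name, the flag values and
the violation counter are hypothetical stand-ins, and the real
prep_new_page and page_owner code paths do considerably more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for gfp flags; the values are illustrative only. */
typedef unsigned int sketch_gfp_t;
#define SKETCH_GFP_ATOMIC 0x1u /* allocation must not sleep */
#define SKETCH_GFP_KERNEL 0x2u /* allocation may sleep */

struct sketch_page {
	sketch_gfp_t traced_gfp; /* gfp recorded for the page itself */
	bool owner_saved;        /* whether a stack record was attempted */
};

static bool sketch_irqs_disabled;   /* models running with local irqs off */
static int sketch_sleep_violations; /* counts would-be "sleeping function" BUGs */

/* Models the allocation done while saving the stack for page_owner. */
static void sketch_save_stack(struct sketch_page *page, sketch_gfp_t alloc_gfp)
{
	/* A sleeping allocation with irqs disabled is the reported bug. */
	if ((alloc_gfp & SKETCH_GFP_KERNEL) && sketch_irqs_disabled)
		sketch_sleep_violations++;
	page->owner_saved = true;
}

/*
 * alloc_gfp: flags used to get memory for the owner record (caller decides).
 * page_gfp:  flags the page was requested with, only recorded for tracing.
 */
static void sketch_prep_new_page(struct sketch_page *page,
				 sketch_gfp_t alloc_gfp, sketch_gfp_t page_gfp)
{
	assert(page != NULL);
	page->traced_gfp = page_gfp;
	sketch_save_stack(page, alloc_gfp);
}

/* Bulk path: runs with irqs disabled, so it passes an atomic alloc_gfp. */
static void sketch_alloc_bulk(struct sketch_page *pages, size_t n,
			      sketch_gfp_t gfp)
{
	sketch_irqs_disabled = true;
	for (size_t i = 0; i < n; i++)
		sketch_prep_new_page(&pages[i], SKETCH_GFP_ATOMIC, gfp);
	sketch_irqs_disabled = false;
}

/* Non-bulk path: both flags stay the same, as before the split. */
static void sketch_alloc_single(struct sketch_page *page, sketch_gfp_t gfp)
{
	sketch_prep_new_page(page, gfp, gfp);
}
```

In this model the bulk path still records the caller's gfp on each page,
but the owner record is allocated with the atomic flag, so the
violation counter stays at zero even when the caller asked for
SKETCH_GFP_KERNEL.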