Hi Damian,

On Tue, Sep 24, 2024 at 10:52:28PM +0200, Damian Dudycz wrote:
> I'm running Gentoo on the PlayStation 3 console (PPC64BE CPU), using custom
> firmware (OtherOS++) feature.
>
> Upgrading from 6.6 to 6.10, I have noticed that OOM kills started during long
> and intense processes, like compiling code or extracting a large archive.
>
> The OOM usually occurs after about 10-20 minutes of for example
> compiling the gentoo-kernel package.

Thanks for your excellent and detailed report, and sorry about the
breakage.

While going through the dmesg, I'm noticing the following:

[ 719.989545] configure invoked oom-killer: gfp_mask=0x400dc0(GFP_KERNEL_ACCOUNT|__GFP_ZERO), order=2, oom_score_adj=0
[ 719.989607] COMPACTION is disabled!!!
[ 719.989633] CPU: 1 PID: 4701 Comm: configure Not tainted 6.9.0-rc4-test-00116-gc0cd6f557b90-dirty #1
[ 719.989665] Hardware name: SonyPS3 Cell Broadband Engine 0x702100 PS3
[ 719.989688] Call Trace:
[ 719.989708] [c00000000a5834a0] [c000000000662e9c] .dump_stack_lvl+0xb0/0x100 (unreliable)
[ 719.989777] [c00000000a583530] [c00000000013e43c] .dump_header+0x5c/0x414
[ 719.989835] [c00000000a583600] [c00000000013ec38] .oom_kill_process+0xcc/0x598
[ 719.989888] [c00000000a5836f0] [c00000000013f6f0] .out_of_memory+0x3d0/0x3f0
[ 719.989939] [c00000000a5837a0] [c00000000018f28c] .__alloc_pages_slowpath.constprop.0+0x540/0x6b0
[ 719.989987] [c00000000a5838f0] [c00000000018f4f4] .__alloc_pages_noprof+0xf8/0x1c0
[ 719.990031] [c00000000a5839c0] [c0000000000505d0] .copy_process+0x1d4/0x1bf0
[ 719.990085] [c00000000a583b40] [c000000000052144] .kernel_clone+0xcc/0x3f0
[ 719.990136] [c00000000a583c50] [c0000000000524d4] .__do_sys_clone+0x6c/0x90
[ 719.990188] [c00000000a583d80] [c00000000001f600] .system_call_exception+0x1f4/0x260
[ 719.990246] [c00000000a583e10] [c00000000000b2d4] system_call_common+0xf4/0x258

This is clone() trying to allocate a thread stack, which is a request
for 4 physically contiguous pages (order=2 -> 2^2 pages).
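
To make the order arithmetic concrete, and to check how many free
blocks of that size a system still has, here is a small Python sketch.
It is not part of the report or of the kernel: the helper names and
the sample buddyinfo row are made up for illustration. It assumes the
usual /proc/buddyinfo layout, where each row lists the number of free
blocks per order, starting at order 0.

```python
def pages_for_order(order: int) -> int:
    """Number of physically contiguous pages in an order-N request."""
    return 1 << order


def free_blocks_at_or_above(buddyinfo_text: str, min_order: int) -> dict:
    """Count free blocks of order >= min_order for each (node, zone).

    Expects rows in /proc/buddyinfo style:
        Node 0, zone   Normal   c0 c1 c2 ...
    where column i is the number of free blocks of order i.
    """
    result = {}
    for line in buddyinfo_text.splitlines():
        parts = line.split()
        # Skip anything that does not look like a buddyinfo row.
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        node = parts[1].rstrip(",")
        zone = parts[3]
        counts = [int(c) for c in parts[4:]]
        result[(node, zone)] = sum(counts[min_order:])
    return result


# Hypothetical, hand-written row (NOT taken from the report): plenty of
# order-0 and order-1 blocks, nothing at order 2 or above.
SAMPLE = "Node 0, zone   Normal   812 140 0 0 0 0 0 0 0 0 0\n"

if __name__ == "__main__":
    print(pages_for_order(2))
    print(free_blocks_at_or_above(SAMPLE, 2))
```

On a real system you would pass in the contents of /proc/buddyinfo
instead of SAMPLE. A zone that reports zero blocks at order 2 and above
has free memory, but nothing contiguous enough for an order=2 request
without compaction or reclaim.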

The second line of that dmesg output warns that you don't have
CONFIG_COMPACTION enabled, which is the kernel's facility to assemble
such contiguous page blocks. (God bless you, Michal Hocko, for adding
this warning.)

This is not a common configuration anymore, as we have since removed
various other mechanisms from the MM code to support higher order
allocations. So I think you may have gotten lucky in the past.

Can you please try with CONFIG_COMPACTION=y?

[ I think what likely happened is that, before my patch, an unmovable
  request falling back to a movable block would have stolen the rest
  of its free pages even if it hadn't claimed the block as unmovable.
  Now it doesn't anymore, and the block, already dominated by cache
  and anon, will continue to fill up with cache and anon. Not an issue
  with compaction - and better for long-term defragmentation
  prospects; but without compaction, you just get a bit less lucky
  specifically with those higher-order kernel requests. ]

Thanks