Re: OOMs on PS3 since kernel 6.9-rc4

Hi Damian,

On Tue, Sep 24, 2024 at 10:52:28PM +0200, Damian Dudycz wrote:
> I'm running Gentoo on the PlayStation 3 console (PPC64BE CPU), using custom
> firmware (OtherOS++) feature.
> 
> Upgrading from 6.6 to 6.10, I have noticed that OOM kills started during long
> and intense processes, like compiling code or extracting a large archive.
> 
> The OOM usually occurs after about 10-20 minutes of for example
> compiling the gentoo-kernel package.

Thanks for your excellent and detailed report, and sorry about the
breakage.

While going through the dmesg, I'm noticing the following:

[  719.989545] configure invoked oom-killer: gfp_mask=0x400dc0(GFP_KERNEL_ACCOUNT|__GFP_ZERO), order=2, oom_score_adj=0
[  719.989607] COMPACTION is disabled!!!
[  719.989633] CPU: 1 PID: 4701 Comm: configure Not tainted 6.9.0-rc4-test-00116-gc0cd6f557b90-dirty #1
[  719.989665] Hardware name: SonyPS3 Cell Broadband Engine 0x702100 PS3
[  719.989688] Call Trace:
[  719.989708] [c00000000a5834a0] [c000000000662e9c] .dump_stack_lvl+0xb0/0x100 (unreliable)
[  719.989777] [c00000000a583530] [c00000000013e43c] .dump_header+0x5c/0x414
[  719.989835] [c00000000a583600] [c00000000013ec38] .oom_kill_process+0xcc/0x598
[  719.989888] [c00000000a5836f0] [c00000000013f6f0] .out_of_memory+0x3d0/0x3f0
[  719.989939] [c00000000a5837a0] [c00000000018f28c] .__alloc_pages_slowpath.constprop.0+0x540/0x6b0
[  719.989987] [c00000000a5838f0] [c00000000018f4f4] .__alloc_pages_noprof+0xf8/0x1c0
[  719.990031] [c00000000a5839c0] [c0000000000505d0] .copy_process+0x1d4/0x1bf0
[  719.990085] [c00000000a583b40] [c000000000052144] .kernel_clone+0xcc/0x3f0
[  719.990136] [c00000000a583c50] [c0000000000524d4] .__do_sys_clone+0x6c/0x90
[  719.990188] [c00000000a583d80] [c00000000001f600] .system_call_exception+0x1f4/0x260
[  719.990246] [c00000000a583e10] [c00000000000b2d4] system_call_common+0xf4/0x258

This is clone() trying to allocate a thread stack, which is a request
for 4 physically contiguous pages (order=2 -> 2^2 pages).
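
To spell out the arithmetic, here is a minimal sketch in Python. The helper names are made up for illustration, and the 4096-byte page size is only an assumption; the dmesg above does not state the page size of this kernel build.

```python
def pages_for_order(order: int) -> int:
    """Number of physically contiguous pages in a buddy allocation of this order."""
    return 1 << order  # 2^order


def bytes_for_order(order: int, page_size: int = 4096) -> int:
    """Size of the allocation in bytes. page_size=4096 is an assumed value."""
    return pages_for_order(order) * page_size


if __name__ == "__main__":
    for order in range(4):
        print(order, pages_for_order(order), bytes_for_order(order))
```

With those assumptions, order=2 comes out as 4 pages, or 16 KiB that must sit next to each other in physical memory.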

The second line warns that you don't have CONFIG_COMPACTION enabled,
which is the kernel's facility to assemble such contiguous page
blocks. (God bless you, Michal Hocko, for adding this warning.)
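
If you want to see how scarce such blocks are on a running system, /proc/buddyinfo lists the number of free blocks per order for each zone. The following is a rough sketch for reading it, not a polished tool. The function names are hypothetical, and the sample line is invented for illustration; it is not taken from your machine.

```python
def parse_buddyinfo(text: str) -> dict:
    """Map (node, zone) to a list of free block counts, indexed by order."""
    zones = {}
    for line in text.splitlines():
        fields = line.split()
        # Expected shape: Node 0, zone Normal <count order 0> <count order 1> ...
        if len(fields) < 5 or fields[0] != "Node" or fields[2] != "zone":
            continue
        node = int(fields[1].rstrip(","))
        zones[(node, fields[3])] = [int(n) for n in fields[4:]]
    return zones


def free_blocks_at_or_above(counts: list, order: int) -> int:
    """Free blocks large enough to satisfy a request of the given order."""
    return sum(counts[order:])


# Invented sample: plenty of order-0 and order-1 blocks, nothing larger.
SAMPLE = "Node 0, zone   Normal    812    240      0      0      0      0      0      0      0      0      0\n"

# On a live system, pass the contents of /proc/buddyinfo instead of SAMPLE:
#     with open("/proc/buddyinfo") as f:
#         zones = parse_buddyinfo(f.read())
```

A zone whose counts are all zero from order 2 upward has free memory only in fragments too small for a request like the one in the trace.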

This is not a common configuration anymore, as we have since removed
various other mechanisms in the MM code that used to support higher-order
allocations. So I think you may have gotten lucky in the past.

Can you please try with CONFIG_COMPACTION=y?
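
In case it helps to double-check what a given kernel was built with, here is a small sketch that looks up an option in the text of a kernel .config. The function name is hypothetical, and where the config text comes from (the build tree's .config, a file under /boot, or /proc/config.gz if the kernel exposes it) is left to you.

```python
def config_value(config_text: str, option: str):
    """Return the option's value ("y", "m", ...), "n" if explicitly unset,
    or None if the option does not appear in the config text at all."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
        if line == f"# {option} is not set":
            return "n"
    return None


# Invented two-line config fragment, for illustration only.
EXAMPLE = "CONFIG_MMU=y\n# CONFIG_COMPACTION is not set\n"

if __name__ == "__main__":
    print(config_value(EXAMPLE, "CONFIG_COMPACTION"))
```

A result of "n" or None for CONFIG_COMPACTION would match the "COMPACTION is disabled!!!" line in the dmesg.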

[ I think what likely happened is that, before my patch, an unmovable
  request falling back to a movable block would have stolen the rest
  of its free pages even if it hadn't claimed the block as unmovable.
  Now it doesn't anymore, and the block, already dominated by cache
  and anon, will continue to fill up with cache and anon. Not an issue
  with compaction - and better for long-term defragmentation
  prospects; but without compaction, you just get a bit less lucky
  specifically with those higher-order kernel requests. ]

Thanks



