On 4/8/22 06:51, Michal Hocko wrote:
> On Fri 08-04-22 06:36:40, Nico Pache wrote:
>>
>>
>> On 4/8/22 05:59, Michal Hocko wrote:
>>> On Fri 08-04-22 05:40:09, Nico Pache wrote:
>>>>
>>>>
>>>> On 4/8/22 05:36, Michal Hocko wrote:
>>>>> On Fri 08-04-22 04:52:33, Nico Pache wrote:
>>>>> [...]
>>>>>> In a heavily contended CPU with high memory pressure the delay may also
>>>>>> lead to other processes unnecessarily OOMing.
>>>>>
>>>>> Let me just comment on this part because there is likely a confusion
>>>>> involved. Delaying the oom_reaper _cannot_ lead to additional OOM killing
>>>>> because the oom killing is throttled by existence of a preexisting
>>>>> OOM victim. In other words as long as there is an alive victim no
>>>>> further victims are selected and the oom killer backs off. The
>>>>> oom_reaper will hide the alive oom victim after it is processed.
>>>>> The longer the delay will be the longer an oom victim can block a
>>>>> further progress but it cannot really cause unnecessary OOMing.
>>>> Is it not the case that if we delay an OOM, the amount of available memory stays
>>>> limited and other processes that are allocating memory can become OOM candidates?
>>>
>>> No. Have a look at oom_evaluate_task (tsk_is_oom_victim check).
>> Ok I see.
>>
>> Doesn't the delay then allow the system to run into the following case more easily?:
>> pr_warn("Out of memory and no killable processes...\n");
>> panic("System is deadlocked on memory\n");
>
> No. Aborting the oom victim search (above mentioned) will cause
> out_of_memory to bail out and return to the page allocator.

Ok I see that now. I did my bit math incorrectly the first time around. I thought
abort led to the !oc->chosen case.
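
For anyone else following along, here is a minimal user-space sketch of the
control flow discussed above. It is only a model, not the code in
mm/oom_kill.c: struct task, struct oom_ctl, evaluate_task() and
model_out_of_memory() are hypothetical stand-ins, and the scoring is reduced
to a single number. The assumption it illustrates is that an aborted scan
leaves a non-NULL sentinel in "chosen", so the !chosen (panic) branch is not
taken and no new victim is killed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for task_struct; not the kernel type. */
struct task {
	const char *comm;
	bool is_oom_victim;	/* models the tsk_is_oom_victim() check */
	bool oom_skip;		/* models "already processed by the oom_reaper" */
	long points;		/* simplified badness score, <= 0 means ineligible */
};

/* Hypothetical stand-in for oom_control. */
struct oom_ctl {
	struct task *chosen;
	long chosen_points;
};

/* Non-NULL sentinel meaning "scan aborted, a live victim already exists". */
#define OOM_ABORT ((struct task *)(uintptr_t)-1)

enum oom_result { OOM_NO_KILLABLE, OOM_BACKED_OFF, OOM_KILLED };

/* Returns non-zero to stop the scan early. */
static int evaluate_task(struct task *t, struct oom_ctl *oc)
{
	if (t->is_oom_victim) {
		if (t->oom_skip)
			return 0;	/* reaped victim is hidden, keep scanning */
		oc->chosen = OOM_ABORT;	/* live victim: throttle further kills */
		return 1;
	}
	if (t->points > oc->chosen_points) {
		oc->chosen = t;
		oc->chosen_points = t->points;
	}
	return 0;
}

static enum oom_result model_out_of_memory(struct task *tasks, size_t n,
					    struct oom_ctl *oc)
{
	assert(oc != NULL);
	oc->chosen = NULL;
	oc->chosen_points = 0;

	for (size_t i = 0; i < n; i++)
		if (evaluate_task(&tasks[i], oc))
			break;

	if (!oc->chosen)
		return OOM_NO_KILLABLE;	/* the pr_warn()/panic() case */
	if (oc->chosen != OOM_ABORT)
		return OOM_KILLED;	/* a new victim would be killed here */
	return OOM_BACKED_OFF;		/* return to the page allocator */
}
```

In this model a task list containing a live, unreaped victim always yields
OOM_BACKED_OFF, whatever the other tasks' scores are. A new victim is only
picked once the existing one is marked as skipped.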

> the only problem with delaying the oom_reaper is that _iff_ the oom
> victim cannot terminate (because it is stuck somewhere in the kernel)
> on its own then the oom situation (be it global, cpuset or memcg) will
> take longer so allocating tasks will not be able to make a forward
> progress.

Ok, so if I understand that correctly, delaying can have some ugly effects and
kind of breaks the initial purpose of the OOM reaper?

I personally don't like the delay approach, especially if we have a better one
that we know is working and that doesn't add regressions.

If someone can prove the private lock case to me, I'd be more willing to bite.

Thanks for all the OOM context :)
-- Nico