On Mon 07-06-21 15:26:00, Waiman Long wrote:
> On 6/7/21 3:01 PM, Michal Hocko wrote:
> > On Mon 07-06-21 17:31:03, Aaron Tomlin wrote:
> > > At the present time, in the context of memcg OOM, even when
> > > sysctl_oom_kill_allocating_task is enabled/or set, the "allocating"
> > > task cannot be selected, as a target for the OOM killer.
> > >
> > > This patch removes the restriction entirely.
> > This is a global oom policy not a memcg specific one so a historical
> > behavior would change. So I do not think we can change that. The policy
> > can be implemented on the memcg level but this would require a much more
> > detailed explanation of the usecase and the semantic (e.g. wrt.
> > hierarchical behavior etc).
>
> Maybe we can extend the meaning of oom_kill_allocating_task such that memcg
> OOM killing of allocating task is only enabled when bit 1 is set. So if an
> existing application just set oom_kill_allocating_task to 1, it will not be
> impacted.

panic_on_oom already allows an originally global policy to be applied to
memcg. So if anything this policy should follow the same interface.
Still, I think what you are seeing is either a bug or something else
(e.g. the task being migrated while the oom is ongoing) and this should
be properly investigated and explained. We cannot simply paper over it
by telling people to use oom_kill_allocating_task as a workaround.

If there is a real use case, such a policy for memcg oom killing can be
discussed of course.
--
Michal Hocko
SUSE Labs