On Wed, 2024-12-11 at 09:30 -0800, Yosry Ahmed wrote:
> On Wed, Dec 11, 2024 at 9:20 AM Rik van Riel <riel@xxxxxxxxxxx> wrote:
> >
> > On Wed, 2024-12-11 at 09:00 -0800, Yosry Ahmed wrote:
> > > On Wed, Dec 11, 2024 at 8:34 AM Rik van Riel <riel@xxxxxxxxxxx> wrote:
> > > >
> > > > If it is a kernel directed memcg OOM kill, that is
> > > > true.
> > > >
> > > > However, if the exit comes from somewhere else,
> > > > like a userspace oomd kill, we might not hit that
> > > > code path.
> > >
> > > Why do we treat dying tasks differently based on the source of
> > > the kill?
> > >
> > Are you saying we should fail allocations for
> > every dying task, and add a check for PF_EXITING
> > in here?
>
> I am asking, not really suggesting anything :)
>
> Does it matter from the kernel perspective if the task is dying due
> to a kernel OOM kill or a userspace SIGKILL?
>
Currently, it does. I'm not sure it should, but
currently it does :/

We are dealing with two conflicting demands here.

On the one hand, we want the exit code to be able
to access things like futex memory, so it can
properly clean up everything the program left
behind.

On the other hand, we don't want the exiting
program to drive up cgroup memory use, especially
not with memory that won't be reclaimed by the
exit.

My patch is an attempt to satisfy both of these
demands, in situations where we currently
exhibit a rather pathological behavior (glacially
slow exit).

-- 
All Rights Reversed.