On Tue, Sep 22, 2020 at 9:34 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Tue 22-09-20 09:29:48, Shakeel Butt wrote:
> > On Tue, Sep 22, 2020 at 8:16 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > >
> > > On Tue 22-09-20 06:37:02, Shakeel Butt wrote:
> [...]
> > > > I talked about this problem with Johannes at LPC 2019 and I think we
> > > > talked about two potential solutions. The first was to somehow give
> > > > memory reserves to oomd and the second was an in-kernel PSI-based
> > > > oom-killer. I am not sure the first one will work in this situation
> > > > but the second one might help.
> > >
> > > Why does your oomd depend on memory allocation?
> > >
> >
> > It does not, but I think my concern was the potential allocations
> > during syscalls.
>
> So what is the problem then? Why can't your oomd kill anything?

From the dump, it seems like it is not able to get the CPU. I am still
trying to extract the reason though.

> > Anyway, what do you think of the in-kernel PSI-based oom-kill
> > trigger? I think Johannes had a prototype as well.
>
> We have talked about something like that in the past and established
> that auto-tuning the oom killer based on PSI is almost impossible to get
> right for all potential workloads, so this belongs to userspace.
> The kernel's oom killer is there as a last resort when the system gets
> close to meltdown.

The system is already in a meltdown state from the user's perspective. I
still think allowing users to optionally set an oom-kill trigger based
on PSI makes sense. Something like "if all processes on the system are
stuck for 60 sec, trigger the oom-killer".
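
For reference, something along these lines can already be prototyped from
userspace with the existing PSI trigger interface
(Documentation/accounting/psi.rst), although a pure userspace monitor of
course runs into exactly the "cannot get CPU" problem above. Below is a
rough, untested sketch just to illustrate the interface such a policy
would build on: the thresholds are made up, PSI caps trigger windows at
10s (so a 60 sec policy would have to count consecutive events), and it
assumes sysrq is enabled so 'f' can invoke the kernel oom killer.

/*
 * Sketch of a userspace PSI-based oom trigger (illustrative only).
 * Registers a "full" memory pressure trigger (1s of full stall within
 * a 10s window) on /proc/pressure/memory, waits for events with
 * poll(POLLPRI), and on each event asks the kernel oom killer to pick
 * a victim via sysrq 'f'.
 */
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	/* threshold and window are in microseconds */
	const char trig[] = "full 1000000 10000000";
	struct pollfd fds;

	fds.fd = open("/proc/pressure/memory", O_RDWR | O_NONBLOCK);
	if (fds.fd < 0) {
		perror("open /proc/pressure/memory");
		return 1;
	}
	/* register the trigger; the string must include the terminator */
	if (write(fds.fd, trig, strlen(trig) + 1) < 0) {
		perror("register PSI trigger");
		return 1;
	}
	fds.events = POLLPRI;

	while (1) {
		int n = poll(&fds, 1, -1);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			perror("poll");
			return 1;
		}
		if (fds.revents & POLLERR) {
			fprintf(stderr, "PSI trigger fd error\n");
			return 1;
		}
		if (fds.revents & POLLPRI) {
			/* pressure exceeded the threshold for the whole
			 * window: let the kernel oom killer run */
			int sysrq = open("/proc/sysrq-trigger", O_WRONLY);

			if (sysrq >= 0) {
				if (write(sysrq, "f", 1) < 0)
					perror("sysrq oom trigger");
				close(sysrq);
			}
		}
	}
	return 0;
}

Doing the same thing in the kernel would avoid the scheduling problem,
which is really the point of the proposal.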