On Tue 12-03-19 09:37:41, Sultan Alsawaf wrote:
> On Tue, Mar 12, 2019 at 09:05:32AM +0100, Michal Hocko wrote:
> > The only way to control the OOM behavior pro-actively is to throttle
> > allocation speed. We have the memcg high limit for that purpose. Along
> > with PSI, I can imagine a reasonably working user space early oom
> > notification and reasonable action upon it.
>
> The issue with pro-active memory management that prompted me to create this
> was poor memory utilization. All of the alternative means of reclaiming
> pages in the page allocator's slow path turn out to be very useful for
> maximizing memory utilization, which is something that we would have to
> forgo by relying on a purely pro-active solution. I have not had a chance
> to look at PSI yet, but unless a PSI-enabled solution allows allocations to
> reach the same point as when the OOM killer is invoked (which is
> contradictory to what it sets out to do), then it cannot take advantage of
> all of the alternative memory-reclaim means employed in the slowpath, and
> will result in killing a process before it is _really_ necessary.

If you really want to reach the real OOM situation then you can very well
rely on the in-kernel OOM killer. The only reason you want a customized oom
killer is the task classification. And that is a different story. User space
hints on victim selection have been a topic for quite a while. It has never
reached any conclusion because the interested parties have always lost
interest once it got hairy.

> > If your design relies on the speed of killing then it is fundamentally
> > flawed AFAICT. You cannot assume anything about how quickly a task dies.
> > It might be blocked in an uninterruptible sleep or performing an
> > operation which takes some time. Sure, oom_reaper might help here but
> > still.
>
> In theory we could instantly zap any process that is not trapped in the
> kernel at the time that the OOM killer is invoked without any consequences
> though, no?

No, this is not so simple. Have a look at the oom_reaper and the hoops it
has to go through.
-- 
Michal Hocko
SUSE Labs
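
[Editor's sketch, appended for illustration of the user space PSI approach
mentioned above. This assumes the PSI trigger interface documented in
Documentation/accounting/psi.txt; the "150ms of some-stall per 1s window"
threshold and the empty kill policy are illustrative assumptions, not part of
the original mail or a recommendation.]

/*
 * Minimal sketch of a user space "early OOM" notifier built on PSI
 * triggers: arm a memory pressure trigger, then poll() for POLLPRI
 * events and hand off to a policy daemon's own task classification.
 */
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	/* Illustrative threshold: 150ms of "some" stall within a 1s window. */
	const char trig[] = "some 150000 1000000";
	struct pollfd pfd;

	pfd.fd = open("/proc/pressure/memory", O_RDWR | O_NONBLOCK);
	if (pfd.fd < 0) {
		perror("open /proc/pressure/memory");
		return 1;
	}
	if (write(pfd.fd, trig, strlen(trig) + 1) < 0) {
		perror("arming PSI trigger");
		return 1;
	}
	pfd.events = POLLPRI;

	for (;;) {
		if (poll(&pfd, 1, -1) < 0) {
			perror("poll");
			return 1;
		}
		if (pfd.revents & POLLERR) {
			fprintf(stderr, "PSI triggers not supported\n");
			return 1;
		}
		if (pfd.revents & POLLPRI) {
			/*
			 * Memory pressure crossed the threshold: a policy
			 * daemon would classify tasks and kill a victim
			 * here, before the kernel OOM killer has to act.
			 */
			fprintf(stderr, "memory pressure event\n");
		}
	}
}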