On Wed 18-05-16 17:18:45, Sebastian Frias wrote:
> Hi Michal,
>
> On 05/17/2016 10:16 PM, Michal Hocko wrote:
> > On Tue 17-05-16 18:16:58, Sebastian Frias wrote:
[...]
> > The global OOM means there is _no_ memory at all. Many kernel
> > operations will need some memory to do something useful. Let's say you
> > would want to do an educated guess about who to kill - most proc APIs
> > will need to allocate. And this is just a beginning. Things are getting
> > really nasty when you get deeper and deeper. E.g. the OOM killer has to
> > give the oom victim access to memory reserves so that the task can exit
> > because that path needs to allocate as well.
>
> Really? I would have thought that once that SIGKILL is sent, the
> victim process is not expected to do anything else and thus its
> memory could be claimed immediately. Or the OOM-killer is more of a
> OOM-terminator? (i.e.: sends SIGTERM)

Well, the path to exit is not exactly trivial. Resources have to be
released and that requires memory sometimes. E.g. exit_robust_list needs
to access the futex and that in turn means a page fault if the memory
was swapped out...

> >So even if you wanted to
> > give userspace some chance to resolve the OOM situation you would either
> > need some special API to tell "this process is really special and it can
> > access memory reserves and it has an absolute priority etc." or have a
> > in kernel fallback to do something or your system could lockup really
> > easily.
> >
>
> I see, so basically at least two cgroups would be needed, one reserved
> for handling the OOM situation through some API and another for the
> "rest of the system". Basically just like the 5% reserved for 'root'
> on filesystems.

If you want to handle memcg OOM then you can use memory.oom_control
(see Documentation/cgroup-v1/memory.txt for more information) and have
the oom handler outside of that memcg.

> Do you think that would work?

But handling the _global_ oom from userspace is just insane with the
current kernel implementation. It just cannot work reliably.
--
Michal Hocko
SUSE Labs
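
For the memcg case mentioned above, a userspace handler could be wired up
roughly as follows. This is only a sketch, not a tested handler. It assumes
a cgroup-v1 memory controller, Python 3.10+ on Linux (for os.eventfd), and
that the handler process itself runs outside the memcg it watches. The
function names are made up for illustration; only memory.oom_control and
cgroup.event_control come from the cgroup-v1 interface.

```python
import os


def parse_oom_control(text):
    """Parse memory.oom_control contents ("key value" per line) into a dict."""
    fields = (line.split() for line in text.splitlines() if line.strip())
    return {key: int(value) for key, value in fields}


def registration_line(event_fd, control_fd):
    """String written to cgroup.event_control: "<eventfd> <fd of memory.oom_control>"."""
    return f"{event_fd} {control_fd}"


def register_oom_listener(cgroup_dir):
    """Register an eventfd for OOM notifications of the memcg at cgroup_dir.

    Returns (event_fd, control_fd); the caller owns and must close both.
    """
    event_fd = os.eventfd(0)
    control_fd = os.open(os.path.join(cgroup_dir, "memory.oom_control"), os.O_RDONLY)
    with open(os.path.join(cgroup_dir, "cgroup.event_control"), "w") as f:
        f.write(registration_line(event_fd, control_fd))
    return event_fd, control_fd


def wait_for_oom(cgroup_dir):
    """Block until the memcg signals an event, then return (count, state)."""
    event_fd, control_fd = register_oom_listener(cgroup_dir)
    try:
        count = os.eventfd_read(event_fd)  # blocks until the kernel signals the eventfd
        with open(os.path.join(cgroup_dir, "memory.oom_control")) as f:
            state = parse_oom_control(f.read())
        return count, state
    finally:
        os.close(control_fd)
        os.close(event_fd)
```

A handler would call something like wait_for_oom("/sys/fs/cgroup/memory/workload")
(a hypothetical path) and then decide what to do, e.g. kill a task in the
group or raise its limit. Writing 1 to memory.oom_control disables the
in-kernel OOM killer for that memcg, so tasks in it wait until the handler
acts. None of this applies to the global OOM case.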