Hi Michal,

On 05/17/2016 10:16 PM, Michal Hocko wrote:
> On Tue 17-05-16 18:16:58, Sebastian Frias wrote:
> [...]
>> From reading Documentation/cgroup-v1/memory.txt (and from a few
>> replies here talking about cgroups), it looks like the OOM-killer is
>> still being actively discussed, well, there's also "cgroup-v2".
>> My understanding is that cgroup's memory control will pause processes
>> in a given cgroup until the OOM situation is solved for that cgroup,
>> right?
>
> It will be blocked waiting either for some external action which would
> result in OOM condition going away or any other charge release. You have
> to configure memcg for that though. The default behavior is to invoke
> the same OOM killer algorithm which is just reduced to tasks from the
> memcg (hierarchy).

Ok, I see, thanks!

>
>> If that is right, it means that there is indeed a way to deal
>> with an OOM situation (stack expansion, COW failure, 'memory hog',
>> etc.) in a better way than the OOM-killer, right?
>> In which case, do you guys know if there is a way to make the whole
>> system behave as if it was inside a cgroup? (*)
>
> No it is not. You have to realize that the system wide and the memcg OOM
> situations are quite different. There is usually quite some memory free
> when you hit the memcg OOM so the administrator can actually do
> something.

Ok, so it works like the 5% reserved for 'root' on filesystems?

> The global OOM means there is _no_ memory at all. Many kernel
> operations will need some memory to do something useful. Let's say you
> would want to do an educated guess about who to kill - most proc APIs
> will need to allocate. And this is just a beginning. Things are getting
> really nasty when you get deeper and deeper. E.g. the OOM killer has to
> give the oom victim access to memory reserves so that the task can exit
> because that path needs to allocate as well.

Really?
I would have thought that once SIGKILL is sent, the victim process is not expected to do anything else, and thus its memory could be reclaimed immediately.
Or is the OOM-killer more of an OOM-terminator (i.e. it sends SIGTERM)?

> So even if you wanted to
> give userspace some chance to resolve the OOM situation you would either
> need some special API to tell "this process is really special and it can
> access memory reserves and it has an absolute priority etc." or have an
> in-kernel fallback to do something or your system could lock up really
> easily.
>

I see, so basically at least two cgroups would be needed: one reserved for handling the OOM situation through some API, and another for the "rest of the system".
Basically just like the 5% reserved for 'root' on filesystems.
Do you think that would work?

Best regards,

Sebastian
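
For reference, the memcg configuration mentioned above ("You have to configure memcg for that though") might look roughly like the sketch below. It assumes the cgroup-v1 memory controller and its memory.oom_control file as described in Documentation/cgroup-v1/memory.txt: writing "1" disables the OOM killer for that memcg, so tasks block until memory is freed, and reading the file returns "key value" lines such as oom_kill_disable and under_oom. The helper names and the group path in the usage note are hypothetical, and this is only an illustration, not a recommended setup.

```python
import os


def parse_oom_control(text):
    """Parse the "key value" lines of a cgroup-v1 memory.oom_control file."""
    state = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            state[parts[0]] = int(parts[1])
    return state


def disable_memcg_oom_killer(cgroup_dir):
    """Write "1" to memory.oom_control in the given memcg directory.

    With the OOM killer disabled, tasks in the memcg that hit the limit
    block until memory is freed or the limit is raised.
    """
    with open(os.path.join(cgroup_dir, "memory.oom_control"), "w") as f:
        f.write("1\n")


def read_oom_state(cgroup_dir):
    """Return the parsed contents of memory.oom_control for a memcg."""
    with open(os.path.join(cgroup_dir, "memory.oom_control")) as f:
        return parse_oom_control(f.read())
```

On a real system this would be called as root with something like disable_memcg_oom_killer("/sys/fs/cgroup/memory/mygroup"), where "mygroup" is a placeholder for an existing memcg; read_oom_state() on the same directory should then report oom_kill_disable as 1.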