On Tue 31-10-17 12:49:59, Johannes Weiner wrote:
> On Tue, Oct 31, 2017 at 09:00:48AM +0100, Michal Hocko wrote:
> > On Mon 30-10-17 12:28:13, Shakeel Butt wrote:
> > > On Mon, Oct 30, 2017 at 1:29 AM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > > > On Fri 27-10-17 13:50:47, Shakeel Butt wrote:
> > > >> > Why is OOM-disabling a thing? Why isn't this simply a "kill everything
> > > >> > else before you kill me"? It's crashing the kernel in trying to
> > > >> > protect a userspace application. How is that not insane?
> > > >>
> > > >> In parallel to other discussion, I think we should definitely move
> > > >> from "completely oom-disabled" semantics to something similar to "kill
> > > >> me last" semantics. Is there any objection to this idea?
> > > >
> > > > Could you be more specific what you mean?
> > >
> > > I get the impression that the main reason behind the complexity of
> > > oom-killer is allowing processes to be protected from the oom-killer
> > > i.e. disabling oom-killing a process by setting
> > > /proc/[pid]/oom_score_adj to -1000. So, instead of oom-disabling, add
> > > an interface which will let users/admins to set a process to be
> > > oom-killed as a last resort.
> >
> > If a process opts in to be oom disabled it needs CAP_SYS_RESOURCE and it
> > probably has a strong reason to do that. E.g. no unexpected SIGKILL
> > which could leave inconsistent data behind. We cannot simply break that
> > contract. Yes, it is a PITA configuration to support but it has its
> > reasons to exist.
>
> I don't think that's true. The most prominent users are things like X
> and sshd, and all they wanted to say was "kill me last."

This might be the case for the desktop environment and I would tend to
agree that those can handle a restart easily. I was considering
applications which need an explicit shutdown, and manual intervention
when they do not get one. Think of a database or similar.

> If sshd were to have a bug and swell up, currently the system would
> kill everything and then panic. It'd be much better to kill sshd at
> the end and let the init system restart it.
>
> Can you describe a scenario in which the NEVERKILL semantics actually
> make sense? You're still OOM-killing the task anyway, it's not like it
> can run without the kernel. So why kill the kernel?

Yes, but you start with a clean state after reboot, which is rather a
different thing than restarting from an inconsistent state.

In any case I am not trying to defend this configuration! I really
dislike it and it shouldn't have ever been introduced. But it has been
established behavior for many years and I am not really willing to break
it without having a _really strong_ reason.
-- 
Michal Hocko
SUSE Labs