On Tue, Jan 7, 2020 at 1:48 PM Mark Otaris <mark@xxxxxxxxx> wrote:
>
> I intended to demonstrate that cgroups can be used to cause the kernel OOM
> killer to react appropriately and fast enough, implying that replacing the
> OOM killer is not necessary and that replacing it by a userspace OOM killer
> that does not account for cgroups can be undesirable. The exact same controls
> set with my example commands, and others, can be set with scopes as well,
> so this should be applicable.
>
> > https://lore.kernel.org/linux-fsdevel/20200104090955.GF23195@xxxxxxxxxxxxxxxxxxx/T/#m8b25fd42501d780d8053fc7aa9f4e3a28a19c49f
>
> Okay, interesting. But that’s a statement from just one person, and it has to
> be interpreted in the context of what it is confirming; that is, that the OOM
> killer is “mainly concerned about kernel survival in low memory situations”,
> which is weaker than your claim that “their concern with kernel oom-killer is
> strictly with keeping the kernel functioning”. I don’t know if the OOM killer’s
> main purpose is to keep the kernel alive (Michal Hocko appears to think so,
> maybe others disagree), but it is in any case not an abuse of the OOM killer to
> also use it to keep userspace responsive,

The OOM killer doesn't keep user space responsive per se; in your example
that's done by cgroups restricting resources. And that's neat, and it's
necessary to keep making forward progress on it. But we don't have that for
unprivileged processes right now, unless the user knows the secret decoder
ring command to run every time they start something in Terminal, and also
has some idea of what resources the task needs in order to succeed rather
than just get clobbered anyway.

That's maybe the elephant in the room with earlyoom (or one of them): yes,
we've recovered sooner, and the user can hopefully save their data and
reboot. But did their task succeed? No. It got clobbered.
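
For illustration, here is a rough sketch of what that per-command opt-in might look like. It assumes systemd with cgroup v2 and a user session in which the memory controller is delegated to the user manager. The helper names (`build_scoped_command`, `as_shell`) and the limit values are hypothetical. The code only builds the `systemd-run` command line and does not execute anything.

```python
import shlex
from typing import List, Optional


def build_scoped_command(argv: List[str], memory_max: str,
                         memory_high: Optional[str] = None) -> List[str]:
    """Wrap argv in a transient systemd user scope with a memory cap.

    Hypothetical helper: it only assembles the command line. MemoryMax=
    and MemoryHigh= are systemd resource-control properties; the values
    passed in are the caller's guess at what the task needs.
    """
    if not argv:
        raise ValueError("argv must contain the command to run")
    cmd = ["systemd-run", "--user", "--scope", "-p", f"MemoryMax={memory_max}"]
    if memory_high is not None:
        # Optional soft limit: the kernel throttles and reclaims above it.
        cmd += ["-p", f"MemoryHigh={memory_high}"]
    # "--" separates systemd-run options from the wrapped command.
    return cmd + ["--"] + list(argv)


def as_shell(cmd: List[str]) -> str:
    """Render the argument list as a copy-pasteable shell line."""
    return shlex.join(cmd)
```

Calling `as_shell(build_scoped_command(["make", "-j8"], "2G"))` yields the line a user would have to type by hand. The caller still has to pick the limits, and a wrong guess still gets the task killed.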
> and there is no reason to think that
> kernel folks are not interested in helping achieve this goal.

I did mean with a kernel-only solution. I've been tracking this issue for
6-7 months, including the ongoing congestion and kswapd discussions, so I
know they do care broadly about providing mechanisms by which user space
can behave better. But all of that requires varying degrees of opt-in, and
quite a lot of it involves considerable work even to understand, let alone
implement.

> The only
> advantage I see to earlyoom so far is that it sends SIGTERM before taking
> further steps that will kill processes.

Yes, and it happens sooner. Probably not soon enough for many users. There
may be some risk of overpromising and underdelivering: we make it the
default, and then in the vast majority of cases it doesn't matter, because
users have long since been conditioned to force power off within a minute
or less of the GUI stuttering or freezing up on them. It is very workload
and system specific.

-- 
Chris Murphy
_______________________________________________
devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx