On Fr, 03.01.20 14:18, Ben Cotton (bcotton@xxxxxxxxxx) wrote:

> https://fedoraproject.org/wiki/Changes/EnableEarlyoom
>
> == Summary ==
> Install earlyoom package, and enable it by default. This will cause
> the kernel oomkiller to trigger sooner, but will not affect which
> process it chooses to kill off. The idea is to recover from out of
> memory situations sooner, rather than the typical complete system hang
> in which the user has no other choice but to force power off.

Hmm, are we sure this is something we want to have in the default
install? Is the code really good enough for that? Looking at the
sources very superficially I see a couple of problems:

1. Waking up all the time in 100ms intervals? We generally try to
   avoid waking the CPU up all the time if nothing happens. Saving
   power and things.

2. New code using system() in the year 2020? Really?

3. Fixed size buffers and implicit, undetected truncation of strings
   in various places (for example, when formatting the shell string
   to pass to system()).

But more importantly: are we sure this actually operates the way it
should? I.e. PSI (pressure stall information) is really what should
be watched. Who uses how much memory is not the interesting part, and
kills should not be triggered on that. What matters is to detect when
the system becomes slow because of it, i.e. the *latencies* introduced
by memory pressure. That's what PSI is about, and hence what should be
used.

But even if we ignored the point that in order to fight latencies one
should watch latencies: OOM killing per process is just not
appropriate on a systemd system. All our system services (and a good
chunk of our user services too) are sorted neatly into cgroups, and we
really should kill them as a whole and not just individual processes
inside them. systemd manages that today, and makes exceptions
configurable via OOMPolicy=, and with your earlyoom stuff you break
that.

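
To make the PSI point above a bit more concrete, here is a minimal Python sketch of reading the memory pressure numbers the kernel exposes in /proc/pressure/memory. The function names, the 10% threshold and the choice of the "full" avg10 value are assumptions made purely for illustration; this is not how earlyoom or any existing daemon decides anything.

```python
def parse_psi(text):
    """Parse the contents of a PSI file such as /proc/pressure/memory.

    Returns a dict keyed by line type, e.g.
    {"some": {"avg10": 0.0, "avg60": 0.0, "avg300": 0.0, "total": 0}, ...}
    """
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        values = {}
        for field in fields:
            key, _, raw = field.partition("=")
            # avg* are percentages of time stalled; total is cumulative
            # stall time in microseconds.
            values[key] = int(raw) if key == "total" else float(raw)
        result[kind] = values
    return result


def under_memory_pressure(text, threshold_pct=10.0):
    """Hypothetical policy: report pressure when the 'full' avg10 value
    (share of the last 10s in which all non-idle tasks were stalled)
    exceeds threshold_pct."""
    psi = parse_psi(text)
    return psi.get("full", {}).get("avg10", 0.0) > threshold_pct


# Made-up sample in the format the kernel prints.
SAMPLE = (
    "some avg10=12.50 avg60=4.20 avg300=1.00 total=123456\n"
    "full avg10=11.00 avg60=3.10 avg300=0.50 total=65432\n"
)
```

On a real system the text would come from open("/proc/pressure/memory").read(), which needs a kernel built with PSI support. Reading the file in a loop would bring back the periodic wake-ups criticized in point 1; the kernel's PSI documentation also describes a trigger interface that can be waited on with poll(), which is probably the better fit.
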
This looks like second guessing the kernel memory management folks at
a place where one can only lose, and at the same time breaking correct
OOM reporting by the kernel via cgroups and stuff.

Also: what precisely is this even supposed to do? Replace the
algorithm for detecting *when* to go on a kill rampage? Or actually
replace the algorithm selecting *what* to kill during a kill rampage?

If it's the former (which the name of the project suggests,
_early_oom), then at the most basic the tool should let the kernel do
the killing, i.e. "echo f > /proc/sysrq-trigger". That way the
reporting via cgroups isn't fucked, and systemd can still do its
thing, and the kernel can kill per cgroup rather than per process...

Anyway, this all sounds very very fishy to me. It is not thought
through to the end, and I am pretty sure this is something the kernel
memory management folks should give their blessing to. Second guessing
the kernel like that is just a bad idea if you ask me.

I mean, yes, the OOM killer might not be that great currently, but
this sounds like something to fix in kernel land. If that doesn't work
out for some reason because kernel devs can't agree, then do it as a
fallback in userspace, but with sound input from the kernel folks, and
the blessing of at least some of them.

Lennart

--
Lennart Poettering, Berlin
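
For reference, the "echo f > /proc/sysrq-trigger" suggestion in the message amounts to a single write, so that the kernel rather than the userspace tool picks and kills the victim. A minimal Python sketch follows; the function name and the path parameter are hypothetical and exist only so the code can be exercised against an ordinary file.

```python
SYSRQ_TRIGGER = "/proc/sysrq-trigger"


def request_kernel_oom_kill(path=SYSRQ_TRIGGER):
    """Write the sysrq command 'f' to the trigger file.

    On the real /proc/sysrq-trigger this asks the kernel to invoke its
    own OOM killer once, so victim selection and reporting stay with
    the kernel. The path argument is an assumption added for testing.
    """
    with open(path, "w") as trigger:
        trigger.write("f")
```

Calling request_kernel_oom_kill() with the default path requires root and will really kill a process, so it should only be tried on a disposable machine.
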