Re: Enable EarlyOOM on Fedora KDE - Fedora 33 Self-Contained Change proposal

>If both RAM and swap go below 10% free, earlyoom issues SIGTERM to the process with the largest oom_score. If both RAM and swap go below 5% free, earlyoom issues SIGKILL

Fedora's earlyoom package ships with changed default settings:

```
EARLYOOM_ARGS="-r 0 -m 4 -M 409600 --prefer '^Web Content$' --avoid '^(dnf|packagekitd|gnome-shell|gnome-session-c|gnome-session-b|lightdm|sddm|sddm-helper|gdm|gdm-wayland-ses|gdm-session-wor|gdm-x-session|Xorg|Xwayland|systemd|systemd-logind|dbus-daemon|dbus-broker|cinnamon|cinnamon-sessio|kwin_x11|kwin_wayland|plasmashell|ksmserver|plasma_session|startplasma-way|xfce4-session|mate-session|marco|lxqt-session|openbox)$'"
```

It means that:

1. The SIGTERM threshold for MemAvailable is 4% (but not more than 400 MiB) and the SIGKILL threshold for MemAvailable is 2% (but not more than 200 MiB) by default. The change was made because earlyoom was criticized for overly aggressive default thresholds, and this was taken into account. Please update the description in the proposal.

2. Firefox's "Web Content" child processes get +300 added to their oom_score. This means that earlyoom will prefer to kill Firefox tabs rather than the entire browser. Chromium and Electron-based apps already behave similarly by default.

3. Processes whose death can take down the entire session (kwin_x11|kwin_wayland|plasmashell|ksmserver|plasma_session etc.) get reduced priority when a victim is chosen. dnf also gets low priority. This is another advantage that you can mention in the proposal.
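
As a rough illustration of item 1, the sketch below computes the effective thresholds for a given amount of RAM. It assumes that when both `-m` and `-M` are passed the lower value wins (which matches the "but not more than" wording above) and that the SIGKILL threshold defaults to half of the SIGTERM threshold. The function name and the example RAM sizes are made up for illustration and are not part of earlyoom.

```python
def effective_thresholds(mem_total_kib, percent=4.0, size_kib=409600):
    """Return (sigterm_kib, sigkill_kib) for the given total RAM in KiB.

    percent  mirrors "-m 4"      (percentage of total memory)
    size_kib mirrors "-M 409600" (absolute size in KiB, i.e. 400 MiB)
    Assumption: the lower of the two limits is the one that applies.
    """
    term_from_percent = mem_total_kib * percent / 100
    sigterm_kib = min(term_from_percent, size_kib)
    # Assumption: the SIGKILL threshold is half of the SIGTERM threshold.
    sigkill_kib = sigterm_kib / 2
    return sigterm_kib, sigkill_kib


# Hypothetical machines: 8 GiB and 16 GiB of RAM.
for gib in (8, 16):
    term, kill = effective_thresholds(gib * 1024 * 1024)
    print(f"{gib} GiB RAM: SIGTERM below {term / 1024:.0f} MiB, "
          f"SIGKILL below {kill / 1024:.0f} MiB")
```

Under these assumptions the percentage applies on smaller machines, and the 400 MiB / 200 MiB caps take over once 4% of RAM exceeds 400 MiB.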

See https://pagure.io/fedora-workstation/issue/119#comment-638366

Tue, 30 Jun 2020 at 22:26, Ben Cotton <bcotton@xxxxxxxxxx> wrote:
https://fedoraproject.org/wiki/Changes/KDEEarlyOOM

== Summary ==
As [[Changes/EnableEarlyoom|Fedora Workstation did in F32]], install earlyoom package, and enable it by default. If both RAM and swap go below 10% free, earlyoom issues SIGTERM to the process with the largest oom_score. If both RAM and swap go below 5% free, earlyoom issues SIGKILL to the process with the largest oom_score. The idea is to recover from out of memory situations sooner, rather than the typical complete system hang in which the user has no other choice but to force power off.

== Owner ==
* Name: [[User:bcotton|Ben Cotton]]
* Email: bcotton@xxxxxxxxxx

== Detailed Description ==
Shamelessly copied from Workstation, which did it in the last release:

Certain workloads have heavy memory demands, quickly consume all of RAM, and start to heavily page out to swap. (Heavy paging is often called "swap thrashing" for added descriptive effect, probably because it's noticeable and annoying.) Incidental swap usage is a good thing: it frees up memory for active pages used by a process. Heavy swap usage quickly leads to a very negative UX, because it's slow, even on modern SSDs. Due to installer defaults, the swap partition is made the same size as available memory (at install time), which can be huge. This just extends swap thrashing time.

On the one hand, we want this resource-hungry job to complete. On the other hand, we want our system to be responsive while that other work is going on. But once the GUI stutters or even comes to an apparent standstill (hang), we're really wishing the kernel oom-killer would kick in and free up memory, so we can start over (maybe using memory or thread limiting options, which arguably should be more intelligently figured out, and that too is a work in progress but beyond the scope of this feature).

However, once in a heavy swap scenario, it's relatively common for the system to get stuck in it, where GUI interactivity is terrible to non-existent, and the kernel oom-killer still doesn't trigger. From a certain point of view, this is working as intended. The kernel oom-killer is concerned about keeping the kernel running. It's not at all concerned about user space responsiveness.

Instead of the system becoming completely unresponsive for tens of minutes, hours or days, this feature expects that an offending process (determined by oom_score, same as the kernel oom-killer) will be killed off within seconds or a few minutes.
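
A minimal sketch of that selection, with the `--prefer`/`--avoid` adjustments from the Fedora defaults discussed in this thread layered on top of oom_score. The +300 for `--prefer` is the figure given in this thread; the matching -300 for `--avoid`, the shortened avoid list, and the process table are assumptions for illustration only.

```python
import re

# Patterns are matched against the process name (comm).
PREFER = re.compile(r"^Web Content$")
# Shortened, illustrative subset of the Fedora --avoid list.
AVOID = re.compile(r"^(dnf|kwin_x11|kwin_wayland|plasmashell|ksmserver)$")


def adjusted_score(comm, oom_score):
    """Apply the prefer/avoid adjustments to a process's oom_score."""
    score = oom_score
    if PREFER.search(comm):
        score += 300  # figure quoted in this thread
    if AVOID.search(comm):
        score -= 300  # assumption: symmetric with --prefer
    return score


def pick_victim(processes):
    """Return the (comm, oom_score) pair with the highest adjusted score."""
    return max(processes, key=lambda p: adjusted_score(p[0], p[1]))


# Hypothetical process table: (comm, oom_score).
table = [("firefox", 500), ("Web Content", 300), ("plasmashell", 650)]
print(pick_victim(table)[0])
```

With this made-up table the browser tab is chosen even though plasmashell has the highest raw oom_score, which is the behavior the prefer and avoid lists are meant to produce.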

== Benefit to Fedora ==

KDE users will be able to take advantage of the benefits Workstation users got from enabling earlyoom in Fedora 32:

* improved user experience by more quickly regaining control over one's system, rather than having to force power off in low-memory situations where there's aggressive swapping. Once a system becomes unresponsive, it's completely reasonable for the user to assume the system is lost, but that includes high potential for data loss.
* reducing forced poweroff as the main workaround will increase data collection, improving understanding of low-memory situations and how to handle them better
* earlyoom first sends SIGTERM to the chosen process, so it has a chance of a proper shutdown, unlike the kernel's oom-killer

== Scope ==
* Proposal owners:
** Modify {{code|https://pagure.io/fedora-comps/blob/master/f/comps-f33.xml.in}} to include the earlyoom package in the {{code|kde-desktop}} section.
** Update {{code|https://src.fedoraproject.org/rpms/fedora-release/blob/master/f/80-kde.preset}} to include:
<pre>
# enable earlyoom by default on KDE
enable earlyoom.service
</pre>

* Other developers: None, unless KDE-based Spins/Labs want to opt out

* Release engineering: N/A
* Policies and guidelines: N/A
* Trademark approval: N/A

== Upgrade/compatibility impact ==
earlyoom.service will be enabled on upgrade. An upgraded system should exhibit the same behaviors as a newly-installed system.

== How To Test ==
* Fedora 31/32 KDE users can test today:
** {{code|sudo dnf install earlyoom}}
** {{code|sudo systemctl enable --now earlyoom}}

And then attempt to cause an out of memory situation. Examples:
** {{code|tail /dev/zero}}
** https://lkml.org/lkml/2019/8/4/15
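
While running these tests it can help to watch how close the system is to the thresholds. The sketch below parses text in {{code|/proc/meminfo}} format and reports MemAvailable and SwapFree as percentages of their totals. The function names are illustrative, and treating a system without swap as 0% swap free is an assumption of this sketch.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (KiB)."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def free_percentages(values):
    """Return (mem_available_percent, swap_free_percent)."""
    mem_pct = 100.0 * values["MemAvailable"] / values["MemTotal"]
    swap_total = values.get("SwapTotal", 0)
    # Assumption: with no swap configured, report 0% swap free.
    swap_pct = 100.0 * values.get("SwapFree", 0) / swap_total if swap_total else 0.0
    return mem_pct, swap_pct


# Usage on a live system (Linux only):
#   with open("/proc/meminfo") as f:
#       print(free_percentages(parse_meminfo(f.read())))
```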

== User Experience ==
earlyoom sends SIGTERM to processes based on oom_score when both memory and swap have less than 10% free and SIGKILL when below 5%.
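
The rule above can be sketched as a small decision function. It uses the 10% and 5% figures from this section; the function name and the string return values are illustrative and are not earlyoom's actual code.

```python
def choose_signal(mem_free_pct, swap_free_pct, term_pct=10.0, kill_pct=5.0):
    """Return "SIGKILL", "SIGTERM", or None for the given free percentages.

    A signal is only chosen when BOTH memory and swap are below the
    threshold; one of them being low on its own is not enough.
    """
    if mem_free_pct < kill_pct and swap_free_pct < kill_pct:
        return "SIGKILL"
    if mem_free_pct < term_pct and swap_free_pct < term_pct:
        return "SIGTERM"
    return None


print(choose_signal(8.0, 9.0))   # both below 10%
print(choose_signal(3.0, 4.0))   # both below 5%
print(choose_signal(3.0, 60.0))  # plenty of swap left
```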

== Dependencies ==
None

== Contingency Plan ==

* Contingency mechanism: Owner reverts changes
* Contingency deadline: Final freeze
* Blocks release? No

== Documentation ==
* {{code|man earlyoom}}
* https://github.com/rfjakob/earlyoom
* https://www.kernel.org/doc/gorman/html/understand/understand016.html

== Release Notes ==
The earlyoom service is now enabled by default in Fedora KDE.

The earlyoom service monitors system memory usage. If free memory falls below a set limit, earlyoom terminates an appropriate process to free up memory. As a result, the system does not become unresponsive for long periods of time in low-memory situations.

The following is the default earlyoom configuration:

* If both RAM and swap go below 10% free, earlyoom sends the SIGTERM signal to the process with the largest oom_score.
* If both RAM and swap go below 5% free, earlyoom sends the SIGKILL signal to the process with the largest oom_score.

For more information, see the earlyoom man page.

--
Ben Cotton
He / Him / His
Senior Program Manager, Fedora & CentOS Stream
Red Hat
TZ=America/Indiana/Indianapolis
_______________________________________________
devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx