Fedora 34 Change: Enable systemd-oomd by default for all variants (System-Wide Change)

https://fedoraproject.org/wiki/Changes/EnableSystemdOomd

== Summary ==

Provide a better experience for Fedora users in out-of-memory (OOM)
situations by enabling
[https://www.freedesktop.org/software/systemd/man/systemd-oomd.html
systemd-oomd] by default. Actions taken by systemd-oomd operate on a
per-cgroup level, aligning well with the life cycle of systemd units.
systemd-oomd primarily uses [https://facebookmicrosites.github.io/psi/
Linux pressure stall information (PSI)] to make decisions based on
wasted productivity due to resource shortages; in addition to that, it
also supports swap based actions.

== Owners ==

* Name: [[User:anitazha|Anita Zhang]], [[User:Dcavalca|Davide
Cavalca]], [[User:Salimma|Michel Salim]], [[User:Htejun|Tejun Heo]],
[[User:3ki|Rik van Riel]]
* Email: the.anitazha@xxxxxxxxx, dcavalca@xxxxxx,
michel@xxxxxxxxxxxxxxx, htejun@xxxxxx, riel@xxxxxx


== Detailed description ==

The primary mechanism used by systemd-oomd for detecting when the
system is out of memory is memory pressure. Memory pressure measures
the percentage of time a cgroup has “wasted” due to lack of memory.
This includes time spent reclaiming free memory, faulting in recently
resident pages, and loading in anonymous pages from swap. When a
monitored cgroup’s memory pressure exceeds the specified thresholds,
systemd-oomd will perform action(s) on the targeted cgroup’s
descendants, starting from the cgroups with the most reclaim scans.
Reclaim activity is used here, rather than the largest consumer, as it
reflects values set in the cgroup memory controller for memory
protection (such as memory.low).
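
The two inputs described above can be inspected by hand. The following is a minimal sketch, not systemd-oomd's own code: it assumes the kernel's PSI text format (as found in /proc/pressure/memory and in each cgroup's memory.pressure file) and the pgscan counter in a cgroup's memory.stat. The helper names psi_avg10 and pgscan_of are hypothetical.

```shell
#!/bin/sh
# psi_avg10 KIND: read PSI-formatted text on stdin and print the avg10 value
# (percentage of time stalled over the last 10 seconds) for the line whose
# first field is KIND ("some" or "full"). Hypothetical helper, illustration only.
psi_avg10() {
    awk -v kind="$1" '
        $1 == kind {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^avg10=/) {
                    sub(/^avg10=/, "", $i)
                    print $i
                }
            }
        }
    '
}

# pgscan_of: read memory.stat-formatted text on stdin and print the number of
# pages scanned by reclaim (the "pgscan" counter). Hypothetical helper.
pgscan_of() {
    awk '$1 == "pgscan" { print $2 }'
}

# Possible usage on a real system (assumes cgroup v2 mounted at /sys/fs/cgroup):
#   psi_avg10 full < /sys/fs/cgroup/user.slice/memory.pressure
#   pgscan_of < /sys/fs/cgroup/user.slice/memory.stat
```

Both helpers only read text from standard input, so they can be tried against saved copies of these files without touching a running system.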

The proposed memory pressure configuration sets
ManagedOOMMemoryPressure=kill and ManagedOOMMemoryPressureLimit=4% on
user@.service, so that systemd-oomd sends SIGKILL to all processes
under a selected cgroup when total memory pressure on all tasks
exceeds 4% for 10 seconds.
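
To illustrate the "exceeds 4% for 10 seconds" rule, here is a rough sketch that checks whether a run of pressure samples has stayed above a limit. It assumes one sample per second, which is an illustrative choice and not necessarily how systemd-oomd tracks the window internally. The function name sustained_over_limit is hypothetical.

```shell
#!/bin/sh
# sustained_over_limit LIMIT COUNT: read one pressure sample (a percentage)
# per line on stdin. Succeeds (exit status 0) only if the most recent COUNT
# samples are all strictly above LIMIT. Hypothetical helper; the
# one-sample-per-second model is an assumption made for illustration.
sustained_over_limit() {
    awk -v limit="$1" -v need="$2" '
        # Extend the current run while above the limit, otherwise reset it.
        { if ($1 + 0 > limit + 0) run++; else run = 0 }
        END { exit !(run >= need + 0) }
    '
}

# Possible usage with the values from the text (4% for 10 samples),
# assuming samples.txt holds one reading per line:
#   sustained_over_limit 4 10 < samples.txt && echo "would act"
```

A single sample at or below the limit resets the run, so a brief spike does not count as sustained pressure in this model.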

For swap-based actions, systemd-oomd monitors the system-wide swap
space and acts when available swap falls below the configured
threshold, selecting cgroups in order from highest swap usage to
lowest. Keeping some amount of swap (if enabled) available will
prevent the kernel OOM killer from killing processes unpredictably and
spending an unbounded amount of time afterwards.

The proposed swap configuration sets SwapUsedLimitPercent=90% in
oomd.conf and ManagedOOMSwap=kill on -.slice (the root cgroup slice),
so that systemd-oomd sends SIGKILL to all processes under a cgroup when
swap used exceeds 90%.
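
The swap check can be approximated from the shell. This is a sketch under stated assumptions rather than systemd-oomd's implementation: swap usage is computed from the SwapTotal and SwapFree fields of /proc/meminfo, and candidate cgroups are ordered by a per-cgroup byte count such as the cgroup v2 memory.swap.current file. The names swap_used_percent and rank_by_swap are hypothetical.

```shell
#!/bin/sh
# swap_used_percent: read /proc/meminfo-formatted text on stdin and print the
# percentage of swap in use, truncated to an integer (0 if there is no swap).
# Hypothetical helper for illustration.
swap_used_percent() {
    awk '
        $1 == "SwapTotal:" { total = $2 }
        $1 == "SwapFree:"  { free = $2 }
        END {
            if (total > 0) printf "%d\n", (total - free) * 100 / total
            else print 0
        }
    '
}

# rank_by_swap: read "CGROUP BYTES" pairs on stdin and print them with the
# highest swap usage first. Hypothetical helper for illustration.
rank_by_swap() {
    sort -k2,2nr
}

# Possible usage on a real system:
#   [ "$(swap_used_percent < /proc/meminfo)" -gt 90 ] && echo "over 90% used"
```

The 90 in the usage comment mirrors the limit proposed above. The helpers themselves hard-code no threshold.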


== Benefit to Fedora ==

* To address the request for better user feedback in
https://pagure.io/fedora-workstation/issue/202, systemd-oomd currently
logs to the journal when a pressure or swap action is about to occur.
There are also debug logs for each process that is sent a SIGKILL;
these can be bumped up in priority. Further notification mechanisms
(e.g. over D-Bus) can also be implemented depending on feedback.
* While systemd-oomd is simpler to configure than the oomd used at
Facebook, the algorithm is largely the same. As such, the following
case study can be used as an example of how PSI and cgroup killing can
release memory not normally resolved with process killing and lead to
better utilization:
https://facebookincubator.github.io/oomd/docs/oomd-casestudy.html
* OOM killing in userspace, before the kernel OOM killer kicks in, has
been shown to be effective at keeping a system functional. An OOM kill
in the kernel is slow, possibly leading to an unbounded amount of time
spent swapping pages in and out and evicting the page cache.
* PSI based actions, versus looking at raw memory consumption numbers,
better reflect memory protection policies set for cgroup resource
control limits (e.g. memory.low).

== Scope ==

* Proposal owners:
** Implement and land additional refinements to systemd-oomd
*** Remove swap as a hard requirement for running systemd-oomd
*** Expand ManagedOOM*= properties to user units (currently only
usable on system units)
*** Configurable memory pressure time window knob
** Enable oomd by default with sensible configuration
** Test days
** Aid with documentation
* Other developers:
** systemd: review PRs as needed
* Release engineering: https://pagure.io/releng/issue/9913
* Policies and guidelines: N/A
* Trademark approval: N/A

== Upgrade/compatibility impact ==

Existing systems running earlyoom will not be modified. One can
transition to systemd-oomd via:

<pre>sudo systemctl disable --now earlyoom
sudo systemctl enable --now systemd-oomd</pre>
Systems that were previously not running earlyoom will have
systemd-oomd enabled by default.

== How to test ==

The systemd 247 build for Fedora includes all the artifacts for
systemd-oomd. It is disabled by default but can be started with:

<pre>sudo systemctl enable --now systemd-oomd</pre>
At this point you can decide which units to set properties on. For
example, to enable swap-based killing on all units below the root
slice:

<pre>sudo systemctl edit --force -- -.slice
[Slice]
ManagedOOMSwap=kill
# save and exit</pre>

Note that the following memory pressure example requires the changes
listed in “Scope” to work as expected, as systemd-oomd shipped with
systemd v247 does not support changing the time window for memory
pressure. This example was run on a system with swap:

<pre>sudo systemctl edit user@.service
[Service]
ManagedOOMMemoryPressure=kill
ManagedOOMMemoryPressureLimit=4%
# save and exit

# will lead to a lot of reclaim and then OOM if not killed
systemd-run --user tail /dev/zero</pre>
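
After setting the properties, it can help to confirm what systemd and systemd-oomd actually see. The sketch below wraps three existing commands (systemctl show, oomctl dump, and journalctl) in two hypothetical helper functions. It assumes that the running systemd version exposes the ManagedOOM* properties through systemctl show, and that the service logs under the unit name systemd-oomd.service.

```shell
#!/bin/sh
# show_oomd_state UNIT: print the ManagedOOM settings systemd reports for
# UNIT, then dump the state systemd-oomd currently holds for monitored
# cgroups. Hypothetical helper; needs a system where systemd-oomd is running.
show_oomd_state() {
    unit="$1"
    systemctl show --property=ManagedOOMSwap \
        --property=ManagedOOMMemoryPressure -- "$unit"
    oomctl dump
}

# recent_oomd_logs: print journal entries from systemd-oomd for this boot.
# Hypothetical helper; the unit name is an assumption.
recent_oomd_logs() {
    journalctl --boot --unit=systemd-oomd.service --no-pager
}

# Possible usage:
#   show_oomd_state -.slice
#   recent_oomd_logs
```

The `--` before the unit name matters for -.slice, which would otherwise be parsed as an option.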

== User experience ==

This should be a fully transparent change for users.

== Dependencies ==

None. If changes to oomd are required to address feedback to this
proposal, they will need to be merged in systemd.

== Contingency plan ==

* Contingency mechanism: For Workstation, the owners will revert all
changes and go back to using earlyoom
* Contingency deadline: Final freeze
* Blocks release? No
* Blocks product? No

== Documentation ==

https://www.freedesktop.org/software/systemd/man/systemd-oomd.html<br />
https://www.freedesktop.org/software/systemd/man/oomctl.html<br />
https://www.freedesktop.org/software/systemd/man/oomd.conf.html

== Release Notes ==

systemd-oomd is enabled by default. Depending on which systemd units
have ManagedOOMSwap=kill or ManagedOOMMemoryPressure=kill,
systemd-oomd will SIGKILL all the processes under the appropriate
descendant cgroups when the configured limits are exceeded.

To revert to earlyoom, run:

<pre>sudo systemctl disable --now systemd-oomd
sudo systemctl enable --now earlyoom</pre>
See man oomd.conf for configuration options.


-- 
Ben Cotton
He / Him / His
Senior Program Manager, Fedora & CentOS Stream
Red Hat
TZ=America/Indiana/Indianapolis