(cc'ing memcg folks for visibility)

On Mon, Aug 22, 2022 at 08:04:02AM -0400, Chris Frey wrote:
> In cgroups v1 we had:
>
>	memory.soft_limit_in_bytes
>	memory.limit_in_bytes
>	memory.memsw.limit_in_bytes
>	memory.oom_control
>
> Using these features, we could achieve:
>
>	- cause programs that were memory hungry to suffer performance, but
>	  not stop (soft limit)
>
>	- cause programs to swap before the system actually ran out of memory
>	  (limit)
>
>	- cause programs to be OOM-killed if they used too much swap
>	  (memsw.limit...)
>
>	- cause programs to halt instead of get killed (oom_control)
>
> That last feature is something I haven't seen duplicated in the settings
> for cgroups v2.  In terms of handling a truly non-malicious memory hungry
> program, it is a feature that has no equal, because the user may require
> time to free up memory elsewhere before allocating more to the program,
> and he may not want the performance degradation, nor the loss of work,
> that comes from the other options.
>
> Is there a reason why it wasn't included in v2?  Is there hope that it will
> come back?

memcg folks will have better answers, but the short answer is that the
kernel really doesn't like handing userspace control of a task that is
stuck with an arbitrary backtrace, and that kernel OOM detection often
kicks in way too late. So cgroup2 instead enables userspace-driven OOM
detection and handling through PSI (pressure stall information). The
following doc has some information on it.

  https://facebookmicrosites.github.io/resctl-demo-website/docs/demo_docs/res_protection/oomd-daemon

FYI, systemd already has its own oomd implementation in systemd-oomd.

Thanks.

-- 
tejun
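
To make the userspace-driven approach a bit more concrete, here is a minimal sketch of the detection half only. It assumes a kernel with PSI enabled and a cgroup2 hierarchy mounted at /sys/fs/cgroup, where each cgroup exposes a memory.pressure file with "some" and "full" lines. The function names, the 60% threshold, and the avg10 window are illustrative assumptions, not what oomd or systemd-oomd actually use.

```python
from pathlib import Path


def parse_psi(text):
    """Parse PSI text into {'some': {...}, 'full': {...}}.

    Each input line looks like:
        some avg10=0.00 avg60=0.00 avg300=0.00 total=0
    """
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        stats = {}
        for field in fields:
            key, value = field.split("=", 1)
            # 'total' is cumulative stall time in microseconds; avgs are percentages.
            stats[key] = int(value) if key == "total" else float(value)
        result[kind] = stats
    return result


def over_pressure(psi, threshold_pct=60.0, window="avg10"):
    """Return True when the 'full' stall percentage reaches the threshold.

    threshold_pct and window are hypothetical policy knobs for this sketch.
    """
    full = psi.get("full")
    return full is not None and full[window] >= threshold_pct


def read_cgroup_pressure(cgroup_dir):
    """Read and parse memory.pressure for one cgroup2 directory."""
    return parse_psi((Path(cgroup_dir) / "memory.pressure").read_text())


if __name__ == "__main__":
    # Hypothetical cgroup path; adjust to a cgroup that exists on your system.
    psi = read_cgroup_pressure("/sys/fs/cgroup/user.slice")
    print("memory pressure too high:", over_pressure(psi))
```

What to do once the threshold trips (kill, freeze, or just notify the user so they can free memory elsewhere) is a policy decision left to the userspace daemon, which is the point of moving the handling out of the kernel.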