Re: [PATCH RFC v2 0/5] mm: Select victim using bpf_oom_evaluate_task

Yosry Ahmed <yosryahmed@xxxxxxxxxx> · Wed, 16 Aug 2023 15:49:07 +0000

> Changes
> -------
>
> This is v2 of the BPF OOM policy patchset.
> v1 : https://lore.kernel.org/lkml/20230804093804.47039-1-zhouchuyi@xxxxxxxxxxxxx/
> v1 -> v2 changes:
>
> - rename bpf_select_task to bpf_oom_evaluate_task and bypass the
> tsk_is_oom_victim (and MMF_OOM_SKIP) logic. (Michal)
>
> - add a new hook to set policy's name, so dump_header() can know
> what has been the selection policy when reporting messages. (Michal)
>
> - add a tracepoint when select_bad_process() find nothing. (Alan)
>
> - add a doc to to describe how it is all supposed to work. (Alan)
>
> ================
>
> This patchset adds a new interface and use it to select victim when OOM
> is invoked. The mainly motivation is the need to customizable OOM victim
> selection functionality.
>
> The new interface is a bpf hook plugged in oom_evaluate_task. It takes oc
> and current task as parameters and return a result indicating which one is
> selected by the attached bpf program.
>
> There are several conserns when designing this interface suggested by
> Michal:
>
> 1. Hooking into oom_evaluate_task can keep the consistency of global and
> memcg OOM interface. Besides, it seems the least disruptive to the existing
> oom killer implementation.
>
> 2. Userspace can handle a lot on its own and provide the input to the BPF
> program to make a decision. Since the oom scope iteration will be
> implemented already in the kernel so all the BPF program has to do is to
> rank processes or memcgs.
>
> 3. The new interface should better bypass the current heuristic rules
> (e.g., tsk_is_oom_victim, and MMF_OOM_SKIP) to meet an arbitrary oom
> policy's need.

Can we linux-mm on such changes? I almost missed this series :)