On Mon 31-07-23 14:00:22, Chuyi Zhou wrote:
> Hello, Michal
>
> On 2023/7/28 01:23, Michal Hocko wrote:
[...]
> > This sounds like a very specific oom policy and that is fine. But the
> > interface shouldn't be bound to any concepts like priorities let alone
> > be bound to memcg based selection. Ideally the BPF program should get
> > the oom_control as an input and either get a hook to kill process or if
> > that is not possible then return an entity to kill (either process or
> > set of processes).
>
> Here are two interfaces I can think of. I was wondering if you could give me
> some feedback.
>
> 1. Add a new hook in select_bad_process(), we can attach it and return a set
> of pids or cgroup_ids which are pre-selected by user-defined policy,
> suggested by Roman. Then we could use oom_evaluate_task to find a final
> victim among them. It's user-friendly and we can offload the OOM policy to
> userspace.
>
> 2. Add a new hook in oom_evaluate_task() and return a point to override the
> default oom_badness return-value. The simplest way to use this is to protect
> certain processes by setting the minimum score.
>
> Of course if you have a better idea, please let me know.

Hooking into oom_evaluate_task seems the least disruptive to the existing
oom killer implementation. I would start by playing with that and see
whether useful oom policies could be defined this way. I am not sure
what the best way is to communicate user input so that a BPF program can
consume it, though. The interface should be generic enough that it
doesn't pre-define any specific class of policies. Maybe we can
add something completely opaque to each memcg/task? Does the BPF
infrastructure allow anything like that already?

-- 
Michal Hocko
SUSE Labs