On Mon 29-08-22 12:40:05, Michal Hocko wrote:
> On Sun 28-08-22 13:50:09, Yu Zhao wrote:
> > On Tue, Aug 23, 2022 at 2:36 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
> [...]
> > > You cannot really make any
> > > assumptions about oom_reaper and how quickly it is going to free the
> > > memory.
> >
> > Agreed. But here we are talking about heuristics, not dependencies on
> > certain behaviors. Assume we are playing a guessing game: there are
> > multiple mm_structs available for reclaim, would the oom-killed ones
> > be more profitable on average? I'd say no, because I assume it's more
> > likely than unlikely that the oom reaper is doing/to do its work. Note
> > that the assumption is about likelihood, hence arguably valid.
>
> Well, my main counter argument would be that we do not really want to
> carve last resort mechanism (which the oom reaper is) into any heuristic
> because any future changes into that mechanism will be much harder to
> justify and change. There is a cost of the maintenance that should be
> considered. While you might be right that this change would be
> beneficial, there is no actual proof of that. Historically we've had
> several examples of such a behavior which was really hard to change
> later on because the effect would be really hard to evaluate.

Forgot to mention a recent change as a clear example of a change which
would have a higher burden to evaluate. e4a38402c36e ("oom_kill.c:
futex: delay the OOM reaper to allow time for proper futex cleanup") has
changed the wake up logic to be triggered after a timeout. This means
that the task will be sitting there on the queue without any actual
reclaim done on it. The timeout itself can be changed in the future and
I would really hate to argue that changing it from $FOO to
$FOO + epsilon breaks a very subtle dependency somewhere deep in the
reclaim path.
From the oom reaper POV any timeout is reasonable because this is the
_last_ resort to resolve OOM stall/deadlock when the victim cannot exit
on its own for whatever reason. This is a considerably different
objective from "we want to optimize which tasks to scan to reclaim
efficiently". See my point?
-- 
Michal Hocko
SUSE Labs