On 2019/07/24 17:07, Michal Hocko wrote:
> On Wed 24-07-19 16:37:35, Tetsuo Handa wrote:
>> On 2019/07/24 15:41, Michal Hocko wrote:
> [...]
>>> That being said, I do not think this patch gives any improvement.
>>>
>>
>> This patch avoids RCU during select_bad_process().
>
> It just shifts where the RCU is taken. Do you have any numbers to show
> that this is an improvement? Basically the only potentially expensive
> thing down the oom_evaluate_task that I can see is the task_lock but I
> am not aware of a single report that this would be a contributor for RCU
> stalls. I can be proven wrong but
>

I don't have numbers (nor do I intend to produce any). What I said is that
we can do reschedulable things from select_bad_process(), should future
development find that useful, because oom_evaluate_task() is called
without RCU.

For now, just a cond_resched() would be added to the select_bad_process()
iteration.

>> This patch allows
>> possibility of doing reschedulable things there; e.g. directly reaping
>> only a portion of OOM victim's memory rather than wasting CPU resource
>> by spinning until MMF_OOM_SKIP is set by the OOM reaper.
>
> We have been through direct oom reaping before and I haven't changed my
> position there. It is just too tricky to be worth it.
>

This is not limited to direct OOM reaping. It applies to anything that
future development might find useful.

Anyway, traversing only once (as this patch does) allows showing a
consistent snapshot of OOM victim candidates. In other words, this patch
makes sure that the OOM victim candidates shown by dump_tasks() are the
ones that select_bad_process() has evaluated, since you said that the main
purpose of the listing is to double check the list to understand the OOM
victim selection. This patch removes the race window in which OOM victim
candidates can be added or removed between the two traversals.
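
To illustrate the "traverse only once" point, here is a minimal user-space
sketch. It is not the patch itself and not kernel code: struct task,
struct snapshot, evaluate_once() and dump_snapshot() are hypothetical names
invented for this illustration, and the score is just a number stored in
the struct. The idea it models is that candidates are copied while they
are being evaluated, so the later dump prints exactly what the selection
saw, even if the task list changes in between.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_CANDIDATES 16

/* Hypothetical stand-in for a task; not the kernel's task_struct. */
struct task {
	int pid;
	long points;	/* pretend badness score */
	int eligible;	/* 0 means the task is skipped */
};

/* Copies of the candidates seen during the single traversal. */
struct snapshot {
	struct task cand[MAX_CANDIDATES];
	size_t nr;
	int chosen;	/* index into cand[], or -1 if none */
};

/*
 * Walk the task array exactly once: copy every eligible task into the
 * snapshot and remember the index of the highest-scoring one.
 * Candidates beyond MAX_CANDIDATES are ignored in this sketch.
 */
static void evaluate_once(const struct task *tasks, size_t n,
			  struct snapshot *snap)
{
	assert(tasks != NULL && snap != NULL);

	snap->nr = 0;
	snap->chosen = -1;
	for (size_t i = 0; i < n; i++) {
		if (!tasks[i].eligible || snap->nr == MAX_CANDIDATES)
			continue;
		snap->cand[snap->nr] = tasks[i];
		if (snap->chosen < 0 ||
		    tasks[i].points > snap->cand[snap->chosen].points)
			snap->chosen = (int)snap->nr;
		snap->nr++;
	}
}

/*
 * Print only what was recorded; the task array is not walked again.
 * Returns the number of candidates printed.
 */
static size_t dump_snapshot(const struct snapshot *snap)
{
	for (size_t i = 0; i < snap->nr; i++)
		printf("pid=%d points=%ld%s\n", snap->cand[i].pid,
		       snap->cand[i].points,
		       (int)i == snap->chosen ? " (chosen)" : "");
	return snap->nr;
}
```

With two separate traversals (one to select, one to dump), a task that
becomes eligible or exits between them would make the printed list differ
from what was evaluated. Here the dump reads only the copies made during
evaluation, so that window does not exist in the model.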