On 2019/07/24 15:41, Michal Hocko wrote:
> On Wed 24-07-19 12:54:36, Tetsuo Handa wrote:
>> Currently out_of_memory() is full of get_task_struct()/put_task_struct()
>> calls. Since "mm, oom: avoid printk() iteration under RCU" introduced
>> a list for holding a snapshot of all OOM victim candidates, let's share
>> that list for select_bad_process() and oom_kill_process() in order to
>> simplify task's refcount handling.
>>
>> As a result of this patch, get_task_struct()/put_task_struct() calls
>> in out_of_memory() are reduced to only 2 times respectively.
>
> This is probably a matter of taste but the diffstat suggests to me that the
> simplification is not all that great. On the other hand this makes the
> oom handling even more tricky and harder for potential further
> development - e.g. if we ever need to break the global lock down in the
> future this would be another obstacle on the way.

If we want to remove oom_lock serialization, we can do it by calling
INIT_LIST_HEAD(&p->oom_candidate) when a thread is created, and by checking
list_empty(&p->oom_candidate) with p->task_lock (or something similar) held
before adding the task to a local on-stack "oom_candidate_list" list stored
in "oc".

But we do not want to jumble concurrent OOM killer messages. Since it is
dump_header() that takes the majority of the time, synchronous printk() will
be the real obstacle on the way. I have tried removing oom_lock
serialization, and the result was commit cbae05d32ff68233 ("printk: Pass
caller information to log_store()."). The OOM killer calls printk() in a
manner that will jumble concurrent OOM killer messages...

> While potential
> development might be too theoretical the benefit of the patch is not
> really clear to me. The task_struct reference counting is not really
> unusual operations and there is nothing really scary that we do with it
> here. We already have to go the extra mile wrt. task_lock so careful
> reference count doesn't really jump out.
>
> That being said, I do not think this patch gives any improvement.
>

This patch avoids RCU during select_bad_process(). It also makes it
possible to do things that may reschedule there; e.g. directly reaping only
a portion of the OOM victim's memory rather than wasting CPU resources by
spinning until MMF_OOM_SKIP is set by the OOM reaper.
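
For illustration only, the oom_lock-free scheme mentioned earlier in this
reply (initialize p->oom_candidate at thread creation, then check
list_empty() with the task lock held before linking the task into a local
list in "oc") could be sketched in userspace C roughly as below. This is not
kernel code and not part of the patch: struct task, struct oom_ctl, the
spinlock built on atomic_flag, and every helper name are hypothetical
stand-ins for task_struct, struct oom_control, task_lock() and the
<linux/list.h> helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

bool list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Link node in front of head, i.e. at the tail of the list. */
void list_append(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Unlink node and make it self-pointing again (empty). */
void list_remove_init(struct list_head *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	init_list_head(node);
}

/* Hypothetical stand-in for task_struct; only what the sketch needs. */
struct task {
	atomic_flag lock;               /* plays the role of p->task_lock */
	struct list_head oom_candidate; /* empty <=> not claimed by any pass */
};

/* Hypothetical stand-in for struct oom_control ("oc"). */
struct oom_ctl {
	struct list_head oom_candidate_list; /* local, per-invocation list */
};

void task_lock(struct task *p)
{
	while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
		; /* spin until the flag was previously clear */
}

void task_unlock(struct task *p)
{
	atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/* Done once, "upon creating a thread". */
void task_init(struct task *p)
{
	atomic_flag_clear(&p->lock);
	init_list_head(&p->oom_candidate);
}

void oom_ctl_init(struct oom_ctl *oc)
{
	init_list_head(&oc->oom_candidate_list);
}

/* Returns true if this pass claimed p, false if another pass holds it. */
bool add_candidate(struct oom_ctl *oc, struct task *p)
{
	bool added = false;

	task_lock(p);
	if (list_is_empty(&p->oom_candidate)) {
		list_append(&p->oom_candidate, &oc->oom_candidate_list);
		added = true;
	}
	task_unlock(p);
	return added;
}

/* Unlink every claimed task so that later passes may claim it again. */
void release_candidates(struct oom_ctl *oc)
{
	while (!list_is_empty(&oc->oom_candidate_list)) {
		struct list_head *node = oc->oom_candidate_list.next;
		struct task *p = (struct task *)
			((char *)node - offsetof(struct task, oom_candidate));

		task_lock(p);
		list_remove_init(node);
		task_unlock(p);
	}
}
```

In this sketch a task that is already on one caller's list is simply
skipped by any other concurrent caller, which is the property the
list_empty() check is meant to provide. It says nothing about the printk()
side, which, as noted above, is where the real difficulty lies.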