On Fri 01-02-19 05:59:55, Tetsuo Handa wrote:
> On 2019/01/31 16:11, Michal Hocko wrote:
> > On Thu 31-01-19 07:49:35, Tetsuo Handa wrote:
> >> This patch reverts both commit 44a70adec910d692 ("mm, oom_adj: make sure
> >> processes sharing mm have same view of oom_score_adj") and commit
> >> 97fd49c2355ffded ("mm, oom: kill all tasks sharing the mm") in order to
> >> close a race and reduce the latency at __set_oom_adj(), and reduces the
> >> warning at __oom_kill_process() in order to minimize the latency.
> >>
> >> Commit 36324a990cf578b5 ("oom: clear TIF_MEMDIE after oom_reaper managed
> >> to unmap the address space") introduced the worst case mentioned in
> >> 44a70adec910d692. But since the OOM killer skips mm with MMF_OOM_SKIP set,
> >> only administrators can trigger the worst case.
> >>
> >> Since 44a70adec910d692 did not take latency into account, we can "hold RCU
> >> for minutes and trigger RCU stall warnings" by calling printk() on many
> >> thousands of thread groups. Also, current code becomes a DoS attack vector
> >> which will allow "stalling for more than one month in unkillable state"
> >> simply printk()ing same messages when many thousands of thread groups
> >> tried to iterate __set_oom_adj() on each other.
> >>
> >> I also noticed that 44a70adec910d692 is racy [1], and trying to fix the
> >> race will require a global lock which is too costly for rare events. And
> >> Michal Hocko is thinking to change the oom_score_adj implementation to per
> >> mm_struct (with shadowed score stored in per task_struct in order to
> >> support vfork() => __set_oom_adj() => execve() sequence) so that we don't
> >> need the global lock.
> >>
> >> If the worst case in 44a70adec910d692 happened, it is an administrator's
> >> request. Therefore, before changing the oom_score_adj implementation,
> >> let's eliminate the DoS attack vector first.
> >
> > This is really ridiculous. I have already nacked the previous version
> > and provided two ways around. The simplest one is to drop the printk.
> > The second one is to move oom_score_adj to the mm struct. Could you
> > explain why do you still push for this?
>
> Dropping printk() does not close the race.

But it does remove the source of a long operation from the RCU context.
If you are not willing to post such a trivial patch I will do so.

> You must propose an alternative patch if you dislike this patch.

I will eventually get there.
-- 
Michal Hocko
SUSE Labs