Re: [PATCH 4/6] mm,oom_reaper: Make OOM reaper use list of mm_struct.

Michal Hocko wrote:
> On Tue 12-07-16 15:46:57, Michal Hocko wrote:
> > On Tue 12-07-16 22:38:42, Tetsuo Handa wrote:
> > > Michal Hocko wrote:
> > > > >  #define MAX_OOM_REAP_RETRIES 10
> > > > > -static void oom_reap_task(struct task_struct *tsk)
> > > > > +static void oom_reap_task(struct task_struct *tsk, struct mm_struct *mm)
> > > > >  {
> > > > >  	int attempts = 0;
> > > > > -	struct mm_struct *mm = NULL;
> > > > > -	struct task_struct *p = find_lock_task_mm(tsk);
> > > > >  
> > > > >  	/*
> > > > > -	 * Make sure we find the associated mm_struct even when the particular
> > > > > -	 * thread has already terminated and cleared its mm.
> > > > > -	 * We might have race with exit path so consider our work done if there
> > > > > -	 * is no mm.
> > > > > +	 * Check MMF_OOM_REAPED in case oom_kill_process() found this mm
> > > > > +	 * pinned.
> > > > >  	 */
> > > > > -	if (!p)
> > > > > -		goto done;
> > > > > -	mm = p->mm;
> > > > > -	atomic_inc(&mm->mm_count);
> > > > > -	task_unlock(p);
> > > > > +	if (test_bit(MMF_OOM_REAPED, &mm->flags))
> > > > > +		return;
> > > > >  
> > > > >  	/* Retry the down_read_trylock(mmap_sem) a few times */
> > > > >  	while (attempts++ < MAX_OOM_REAP_RETRIES && !__oom_reap_task(tsk, mm))
> > > > >  		schedule_timeout_idle(HZ/10);
> > > > >  
> > > > >  	if (attempts <= MAX_OOM_REAP_RETRIES)
> > > > > -		goto done;
> > > > > +		return;
> > > > >  
> > > > >  	/* Ignore this mm because somebody can't call up_write(mmap_sem). */
> > > > >  	set_bit(MMF_OOM_REAPED, &mm->flags);
> > > > 
> > > > This seems unnecessary when oom_reaper always calls exit_oom_mm. The
> > > > same applies to __oom_reap_task. Which then means that the flag is
> > > > turning into a misnomer. MMF_SKIP_OOM would fit better its current
> > > > meaning.
> > > 
> > > Large oom_score_adj value or being a child process of highest OOM score
> > > might cause the same mm being selected again. I think these set_bit() are
> > > necessary in order to avoid the same mm being selected again.
> > 
> > I do not understand. Child will have a different mm struct from the
> > parent and I do not see how oom_score_adj is relevant here. Could you
> > elaborate, please?
> 
> OK, I guess I got your point. You mean we can select the same child/task
> again after it has passed its exit_oom_mm. Trying to oom_reap such a
> task would be obviously pointless. Then it would be better to stitch that
> set_bit into exit_oom_mm. Renaming it would be also better in that
> context.
> 

Right.

oom_kill_process() receives the task for which oom_badness() returned the
highest score. But list_for_each_entry(child, &t->children, sibling) selects
a child process instead if that task has any OOM-killable child process. The
OOM killer then kills that child process and the OOM reaper reaps memory from
it. If the OOM reaper does not set MMF_OOM_REAPED after reaping that child's
memory, the next round of the OOM killer will select that child process
again. A next round is likely whenever the child process was consuming too
little memory to resolve the OOM situation.
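
To make the scenario concrete, here is a rough userspace model of it. This
is not kernel code: struct toy_mm, struct toy_task, toy_pick_victim() and
toy_reap() are hypothetical names, the "reaped" field merely stands in for
MMF_OOM_REAPED, and the page count is a simplification of what
oom_badness() measures. It assumes the reaper can free only part of an mm,
so the child still has a non-zero footprint afterwards.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an mm; not the kernel's mm_struct. */
struct toy_mm {
	unsigned long pages;	/* memory currently charged to this mm */
	unsigned long reapable;	/* part of it the reaper is able to free */
	bool reaped;		/* models MMF_OOM_REAPED */
};

/* Hypothetical stand-in for a task with a list of children. */
struct toy_task {
	struct toy_mm *mm;
	struct toy_task **children;
	size_t nr_children;
};

/*
 * Prefer the biggest child that has its own mm and is not flagged as
 * reaped; fall back to the task itself when no such child exists.
 */
static struct toy_task *toy_pick_victim(struct toy_task *t)
{
	struct toy_task *victim = t;
	unsigned long best = 0;

	for (size_t i = 0; i < t->nr_children; i++) {
		struct toy_task *child = t->children[i];

		if (child->mm == t->mm || child->mm->reaped)
			continue;
		if (child->mm->pages > best) {
			best = child->mm->pages;
			victim = child;
		}
	}
	return victim;
}

/* Free the reapable part of mm; set the flag only when asked to. */
static unsigned long toy_reap(struct toy_mm *mm, bool set_flag)
{
	unsigned long freed = mm->reapable;

	mm->pages -= freed;
	mm->reapable = 0;
	if (set_flag)
		mm->reaped = true;
	return freed;
}
```

With a parent holding 1000 pages and a child holding 50 (30 of them
reapable), toy_pick_victim() returns the child. After toy_reap() without
the flag the child still holds 20 pages and is returned again; once the
flag is set, selection falls back to the parent.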

Even if that task has no OOM-killable child process, the task itself may
have been consuming too little memory to resolve the OOM situation. We can
steer the OOM killer toward such a process by setting its oom_score_adj
to 1000.
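
The oom_score_adj case can be sketched the same way. The model below is a
simplification I am assuming for illustration, not the kernel's
oom_badness(): the score is the page usage plus oom_score_adj scaled by
totalpages / 1000, and a flagged process scores 0. struct toy_proc,
toy_badness() and toy_select() are hypothetical names, and the "skip" field
stands in for MMF_OOM_REAPED.

```c
#include <stddef.h>

#define TOY_SCORE_ADJ_MIN (-1000)

/* Hypothetical process record for the model. */
struct toy_proc {
	long usage_pages;	/* pages the process is using */
	long oom_score_adj;	/* -1000 .. 1000 */
	int skip;		/* models MMF_OOM_REAPED on its mm */
};

/* Simplified score: usage plus the adjustment scaled to total memory. */
static long toy_badness(const struct toy_proc *p, long totalpages)
{
	long points;

	if (p->skip || p->oom_score_adj == TOY_SCORE_ADJ_MIN)
		return 0;
	points = p->usage_pages + p->oom_score_adj * (totalpages / 1000);
	return points > 0 ? points : 1;
}

/* Return the highest-scoring process, or NULL if every score is 0. */
static const struct toy_proc *toy_select(const struct toy_proc *procs,
					 size_t n, long totalpages)
{
	const struct toy_proc *chosen = NULL;
	long best = 0;

	for (size_t i = 0; i < n; i++) {
		long points = toy_badness(&procs[i], totalpages);

		if (points > best) {
			best = points;
			chosen = &procs[i];
		}
	}
	return chosen;
}
```

In this model a process using 10 pages with oom_score_adj 1000 outscores
one using 500000 pages with oom_score_adj 0 when totalpages is 1000000, so
it keeps being selected until its skip flag is set.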
