Re: [PATCH 3/6] mm, oom_adj: make sure processes sharing mm have same view of oom_score_adj

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon 30-05-16 13:26:44, Vladimir Davydov wrote:
> On Mon, May 30, 2016 at 11:39:50AM +0200, Michal Hocko wrote:
[...]
> > Yes and that leads me to a suspicion that we can do that. Maybe I should
> > just add a note into the log that we are doing that so that people can
> > complain? Something like the following
> > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > index fa0b3ca94dfb..7f3495415719 100644
> > --- a/fs/proc/base.c
> > +++ b/fs/proc/base.c
> > @@ -1104,7 +1104,6 @@ static int __set_oom_adj(struct file *file, int oom_adj, bool legacy)
> >  err_sighand:
> >  	unlock_task_sighand(task, &flags);
> >  err_put_task:
> > -	put_task_struct(task);
> >  
> >  	if (mm) {
> >  		struct task_struct *p;
> > @@ -1113,6 +1112,10 @@ static int __set_oom_adj(struct file *file, int oom_adj, bool legacy)
> >  		for_each_process(p) {
> >  			task_lock(p);
> >  			if (!p->vfork_done && process_shares_mm(p, mm)) {
> > +				pr_info("updating oom_score_adj for %d (%s) from %d to %d because it shares mm with %d (%s). Report if this is unexpected.\n",
> > +						task_pid_nr(p), p->comm,
> > +						p->signal->oom_score_adj, oom_adj,
> > +						task_pid_nr(task), task->comm);
> 
> IMO this could be acceptable from userspace pov, but I don't very much
> like how vfork is special-cased here and in oom killer code.

Well, the vfork has to be special cased here. We definitely have to
support
	vfork()
	set_oom_score_adj()
	exec()

use case. And I do not see other way without adding something to the
clone hot paths which sounds like not justifiable considering we are
talking about a really rare usecase that basically nobody cares about.
 
[...]
> > so one process would want to be always selected while the other one
> > doesn't want to get killed. All they can see is that everything is
> > put in place until the oom killer comes over and ignores that.
> 
> If we stored minimal oom_score_adj in mm struct, oom killer wouldn't
> kill any of these processes, and it looks fine to me as long as we want
> oom killer to be mm based, not task or signal_struct based.
> 
> Come to think of it, it'd be difficult to keep mm->oom_score_adj in sync
> with p->signal->oom_score_adj, because we would need to update
> mm->oom_score_adj not only on /proc write, but also on fork. May be, we
> could keep all signal_structs sharing mm linked in per mm list so that
> we could quickly update mm->oom_score_adj on fork? That way we wouldn't
> need to special case vfork.

Yes the current approach is slightly racy but I do not see that would
matter all that much. What you are suggesting might work but I am not
really sure we want to complicate the whole thing now. Sure if we see
that those races are real we can try to find a better solution, but I
would like to start as easy as possible and placing all the logic into
the oom_score_adj proc handler sounds like a good spot to me.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]