Re: [PATCH 1/2] memcg: make oom_lock 0 and 1 based rather than coutner

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 14 Jul 2011 14:55:55 +0200
Michal Hocko <mhocko@xxxxxxx> wrote:

> On Thu 14-07-11 20:50:12, KAMEZAWA Hiroyuki wrote:
> > On Thu, 14 Jul 2011 13:30:09 +0200
> > Michal Hocko <mhocko@xxxxxxx> wrote:
> [...]
> > >  static bool mem_cgroup_oom_lock(struct mem_cgroup *mem)
> > >  {
> > > -	int x, lock_count = 0;
> > > -	struct mem_cgroup *iter;
> > > +	int x, lock_count = -1;
> > > +	struct mem_cgroup *iter, *failed = NULL;
> > > +	bool cond = true;
> > >  
> > > -	for_each_mem_cgroup_tree(iter, mem) {
> > > -		x = atomic_inc_return(&iter->oom_lock);
> > > -		lock_count = max(x, lock_count);
> > > +	for_each_mem_cgroup_tree_cond(iter, mem, cond) {
> > > +		x = !!atomic_add_unless(&iter->oom_lock, 1, 1);
> > > +		if (lock_count == -1)
> > > +			lock_count = x;
> > > +		else if (lock_count != x) {
> > > +			/*
> > > +			 * this subtree of our hierarchy is already locked
> > > +			 * so we cannot give a lock.
> > > +			 */
> > > +			lock_count = 0;
> > > +			failed = iter;
> > > +			cond = false;
> > > +		}
> > >  	}
> > 
> > Hm ? assuming B-C-D is locked and a new thread tries a lock on A-B-C-D-E.
> > And for_each_mem_cgroup_tree will find groups in order of A->B->C->D->E.
> > Before lock
> >   A  0
> >   B  1
> >   C  1
> >   D  1
> >   E  0
> > 
> > After lock
> >   A  1
> >   B  1
> >   C  1
> >   D  1
> >   E  0
> > 
> > here, failed = B, cond = false. Undo routine will unlock A.
> > Hmm, seems to work in this case.
> > 
> > But....A's oom_lock==0 and memcg_oom_wakeup() at el will not able to
> > know "A" is in OOM. wakeup processes in A which is waiting for oom recover..
> 
> Hohm, we need to have 2 different states. lock and mark_oom.
> oom_recovert would check only the under_oom.
> 

yes. I think so, too.

> > 
> > Will this work ?
> 
> No it won't because the rest of the world has no idea that A is
> under_oom as well.
> 
> > ==
> >  # cgcreate -g memory:A
> >  # cgset -r memory.use_hierarchy=1 A
> >  # cgset -r memory.oom_control=1   A
> >  # cgset -r memory.limit_in_bytes= 100M
> >  # cgset -r memory.memsw.limit_in_bytes= 100M
> >  # cgcreate -g memory:A/B
> >  # cgset -r memory.oom_control=1 A/B
> >  # cgset -r memory.limit_in_bytes=20M
> >  # cgset -r memory.memsw.limit_in_bytes=20M
> > 
> >  Assume malloc XXX is a program allocating XXX Megabytes of memory.
> > 
> >  # cgexec -g memory:A/B malloc 30  &    #->this will be blocked by OOM of group B
> >  # cgexec -g memory:A   malloc 80  &    #->this will be blocked by OOM of group A
> > 
> > 
> > Here, 2 procs are blocked by OOM. Here, relax A's limitation and clear OOM.
> > 
> >  # cgset -r memory.memsw.limit_in_bytes=300M A
> >  # cgset -r memory.limit_in_bytes=300M A
> > 
> >  malloc 80 will end.
> 
> What about yet another approach? Very similar what you proposed, I
> guess. Again not tested and needs some cleanup just to illustrate.
> What do you think?

Hmm, I think this will work. Please go ahead.
Unfortunately, I'll not be able to make a quick response for a week
because of other tasks. I'm sorry.

Anyway,
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> 

BTW, it's better to add "How-to-test" and the result in description.
Some test similar to mine will show the result we want.
==
Make a hierarchy of memcg, which has 300MB memory+swap limit.

 %cgcreate -g memory:A
 %cgset -r memory.limit_in_bytes=300M A
 %cgset -r memory.memsw.limit_in_bytes=300M A

Then, running folloing program under A.
 %cgexec -g memory:A ./fork
==
int main(int argc, char *argv[])
{
        int i;
        int status;

        for (i = 0; i < 5000; i++) {
                if (fork() == 0) {
                        char *c;
                        c = malloc(1024*1024);
                        memset(c, 0, 1024*1024);
                        sleep(20);
                        fprintf(stderr, "[%d]\n", i);
                        exit(0);
                }
                printf("%d\n", i);
                waitpid(-1, &status, WNOHANG);
        }
        while (1) {
                int ret;
                ret = waitpid(-1, &status, WNOHANG);

                if (ret == -1)
                        break;
                if (!ret)
                        sleep(1);
        }
        return 0;
}
==

Thank you for your effort.
-Kame

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]