Re: [RFC][PATCH 8/9 v2] cgroup: avoid creating new cgroup under a cgroup being destroyed

On Sat, Apr 28, 2012 at 11:00 AM, Tejun Heo <tj@xxxxxxxxxx> wrote:
> Hi, KAME.
>
> On Sat, Apr 28, 2012 at 09:20:52AM +0900, Hiroyuki Kamezawa wrote:
>> What I thought was...
>> Assume a memory cgroup A, with use_hierarchy==1.
>>
>> 1.  thread:0   starts calling pre_destroy() of cgroup A
>> 2.  thread:0   sometimes calls cond_resched() or other sleeping functions.
>> 3.  thread:1   creates a cgroup B under "A"
>> 4.  thread:1   attaches a thread X to cgroup A/B
>> 5.  res_counter of A is charged up, but pre_destroy() can't see what happened
>>     because it scans the LRU of A.
>>
>> So, we have -EBUSY now. I considered some options to fix this.
>>
>> option 1) just return 0 instead of -EBUSY when pre_destroy() finds a
>> task or a child.
>>
>> There is a race: even if we return 0 here and expect the cgroup code
>> to catch it,
>> the thread or child we found may be moved to another cgroup before we check it
>> in cgroup's final check.
>> In that case, the cgroup will be freed before the full ack of
>> pre_destroy(), and the charges
>> will be lost.
>
> So, cgroup code won't proceed with rmdir if children are created in
> between. Note also that the lost-charge race condition you described
> above existed before this change, i.e. a new cgroup could be created
> after pre_destroy() is complete.
>
> The current cgroup rmdir code is transitional.  It has to support both
> retrying and non-retrying pre_destroy()s, and that means we can't mark
> the cgroup DEAD before we start invoking pre_destroy(); however, we
> can do that once memcg's pre_destroy() is converted, which will also
> remove the whole WAIT_ON_RMDIR mechanism and the race described above.
>
> There really isn't much point in trying to make the current cgroup
> rmdir behave perfectly when the next step is removing all the fixed up
> parts.
>
> So, IMHO, just making pre_destroy() clean up its own charges and
> always return 0 is enough.  There's no need to fix up an old
> non-critical race condition at this point in the patch stream.  cgroup
> rmdir simplification will make it disappear anyway.
>
So, hmm, ok. I'll drop patches 7 & 8. memcg may return -EBUSY in a very, very
rare race case, but users will not see it in most cases.
I'll fix the limit, move-charge and use_hierarchy problems first.
Thanks,
-Kame
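
For readers following the thread, the five-step scenario quoted above can be
modeled in user space. This is only a rough sketch, not kernel code: every
toy_* name is hypothetical, the struct is not the kernel's memcg or
res_counter, and "cleaning up own charges" is simplified to dropping the pages
on the group's own LRU (cgroup A is assumed to have no parent). It shows
pre_destroy() returning 0 while A's hierarchical counter still holds the
charge made through A/B, and the core's final check refusing the rmdir because
a child exists.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a memory cgroup; NOT the kernel's data structure. */
struct toy_cgroup {
	struct toy_cgroup *parent;
	long usage;       /* hierarchical counter: own + descendants' charges */
	long lru_pages;   /* pages on this group's own LRU only */
	int nr_children;
	int nr_tasks;
};

/* With use_hierarchy==1, a charge is accounted to every ancestor as well. */
static void toy_charge(struct toy_cgroup *cg, long pages)
{
	cg->lru_pages += pages;
	for (struct toy_cgroup *c = cg; c; c = c->parent)
		c->usage += pages;
}

/* Create an empty child under parent. */
static void toy_add_child(struct toy_cgroup *parent, struct toy_cgroup *child)
{
	*child = (struct toy_cgroup){ .parent = parent };
	parent->nr_children++;
}

/*
 * pre_destroy() as suggested in the thread: clean up only what is found
 * on our own LRU and always return 0 (simplified: the pages are dropped).
 */
static int toy_pre_destroy(struct toy_cgroup *cg)
{
	long pages = cg->lru_pages;

	cg->lru_pages = 0;
	for (struct toy_cgroup *c = cg; c; c = c->parent)
		c->usage -= pages;
	return 0;
}

/* Final check in the core: rmdir proceeds only without tasks and children. */
static bool toy_can_rmdir(const struct toy_cgroup *cg)
{
	return cg->nr_tasks == 0 && cg->nr_children == 0;
}

struct toy_result {
	int pre_destroy_ret;
	long usage_a;
	bool rmdir_ok;
};

/* Replays steps 1-5 with thread 1 running while thread 0 sleeps. */
static struct toy_result toy_race(void)
{
	struct toy_cgroup a = { 0 }, b;
	struct toy_result r;

	toy_charge(&a, 4);                  /* A starts with 4 pages of its own */
	/* steps 1-2: thread 0 has entered pre_destroy(A) and is sleeping */
	toy_add_child(&a, &b);              /* step 3: thread 1 creates A/B */
	b.nr_tasks = 1;                     /* step 4: thread X attached to A/B */
	toy_charge(&b, 3);                  /* step 5: A's counter charged via B */
	r.pre_destroy_ret = toy_pre_destroy(&a); /* thread 0 finishes its scan */
	r.usage_a = a.usage;                /* charge from B is still accounted */
	r.rmdir_ok = toy_can_rmdir(&a);     /* core sees the child and refuses */
	return r;
}
```

Calling toy_race() from a small main() and inspecting the result shows the
point made in the thread: pre_destroy() itself does not need to return -EBUSY,
because the child created in between is caught by the final check.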
