On Sat, Apr 28, 2012 at 11:00 AM, Tejun Heo <tj@xxxxxxxxxx> wrote:
> Hi, KAME.
>
> On Sat, Apr 28, 2012 at 09:20:52AM +0900, Hiroyuki Kamezawa wrote:
>> What I thought was...
>> Assume a memory cgroup A, with use_hierarchy==1.
>>
>> 1. thread:0 starts calling pre_destroy() of cgroup A
>> 2. thread:0 sometimes calls cond_resched() or other sleep functions.
>> 3. thread:1 creates a cgroup B under "A"
>> 4. thread:1 attaches a thread X to cgroup A/B
>> 5. res_counter of A is charged up, but pre_destroy() can't find what
>>    happened because it scans the LRU of A.
>>
>> So, we have -EBUSY now. I considered some options to fix this.
>>
>> option 1) just return 0 instead of -EBUSY when pre_destroy() finds a
>> task or a child.
>>
>> There is a race... even if we return 0 here and expect that the cgroup
>> code can catch it, the thread or the child we found may be moved to
>> another cgroup before we check it in cgroup's final check.
>> In that case, the cgroup will be freed before full-ack of
>> pre_destroy() and the charges will be lost.
>
> So, cgroup code won't proceed with rmdir if children are created
> inbetween and note that the race condition of lost charge you
> described above existed before this change - ie. new cgroup could be
> created after pre_destroy() is complete.
>
> The current cgroup rmdir code is transitional. It has to support both
> retrying and non-retrying pre_destroy()s and that means we can't mark
> the cgroup DEAD before starting invoking pre_destroy(); however, we
> can do that once memcg's pre_destroy() is converted which will also
> remove all the WAIT_ON_RMDIR mechanism and the above described race.
>
> There really isn't much point in trying to make the current cgroup
> rmdir behave perfectly when the next step is removing all the fixed up
> parts.
>
> So, IMHO, just making pre_destroy() clean up its own charges and
> always returning 0 is enough. There's no need to fix up old
> non-critical race condition at this point in the patch stream.
> cgroup rmdir simplification will make them disappear anyway.
>

So, hmm, ok. I'll drop patches 7 & 8.
memcg may return -EBUSY in a very, very rare race case, but users will not
see it in most cases.

I'll fix the limit, move-charge and use_hierarchy problems first.

Thanks,
-Kame
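
For reference, the five-step scenario quoted above can be sketched as a toy
model. This is only an illustration in Python, not kernel code: the Cgroup
class, the racing_step callback and the single-threaded "race" are all
assumptions made up for the sketch. It only shows why scanning A's own LRU
cannot account for a charge that arrived through a child created in between.

```python
import errno


class Cgroup:
    """Toy memory cgroup with use_hierarchy == 1 (hypothetical model)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.lru = []    # pages charged directly to this group
        self.usage = 0   # stands in for res_counter; includes descendants
        if parent is not None:
            parent.children.append(self)

    def _adjust(self, delta):
        # Hierarchical accounting: apply the change here and in every ancestor.
        node = self
        while node is not None:
            node.usage += delta
            node = node.parent

    def charge(self, page):
        self.lru.append(page)
        self._adjust(+1)

    def drain_own_lru(self):
        # What this toy pre_destroy() can see: only pages on its own LRU.
        while self.lru:
            self.lru.pop()
            self._adjust(-1)


def pre_destroy(cg, racing_step=None):
    """Drain cg's own LRU; return 0 on success or -EBUSY if charges remain.

    racing_step models thread:1 running while thread:0 sleeps (steps 2-4).
    """
    if racing_step is not None:
        racing_step()
    cg.drain_own_lru()
    if cg.children or cg.usage > 0:
        return -errno.EBUSY
    return 0


if __name__ == "__main__":
    a = Cgroup("A")
    a.charge("page0")

    def thread1():
        b = Cgroup("B", parent=a)   # step 3: create A/B
        b.charge("page1")           # steps 4-5: charge shows up in A's usage

    print(pre_destroy(a, thread1))
```

In this model the leftover usage in A comes from B's page, which never
appears on A's LRU, so the scan finishes with a non-zero counter.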