Hello, guys.

On Wed, Feb 05, 2014 at 10:28:21AM -0500, Johannes Weiner wrote:
> I thought more about this and talked to Tejun as well.  He told me
> that the rcu grace period between disabling tryget and calling
> css_offline() is currently an implementation detail of the refcounter
> that css uses, but it's not a guarantee.  So my initial idea of

Yeah, that's an implementation detail coming from how percpu_ref is
implemented at the moment.  Also, it's a sched_rcu grace period, not a
normal one.  The only RCU-related guarantee that cgroup core gives is
that there will be a full RCU grace period between css's ref reaching
zero and invocation of ->css_free() so that it's safe to do
css_tryget() inside RCU critical sections.

In short, offlining is *not* protected by RCU.  Freeing is.

> Well, css_free() is the callback invoked when the ref counter hits 0,
> and that is a guarantee.  From a memcg perspective, it's the right
> place to do reparenting, not css_offline().

So, css_offline() is cgroup telling controllers two things.

* The destruction of the css, which will commence when css ref reaches
  zero, has been initiated.  If you're holding any long term css refs
  for caching and stuff, put them so that destruction can actually
  happen.

* Any css_tryget() attempts which haven't finished yet are guaranteed
  to fail.  (there's no implied RCU protection here)

Maybe offline is a bit of a misnomer.  It's really just telling the
controllers to get prepared to be destroyed.

> Here is the only exception to the above: swapout records maintain
> permanent css references, so they prevent css_free() from running.
> For that reason alone we should run one optimistic reparenting in
> css_offline() to make sure one swap record does not pin gigabytes of
> pages in an offlined cgroup, which is unreachable for reclaim.  But
> the reparenting for *correctness* is in css_free(), not css_offline().

A more canonical use case can be found in blkcg.
blkcg holds "cache" css refs for optimization in the indexing data
structure.  On offline, blkcg purges those refs so that those stale
cache refs don't put off actual destruction for too long.  But yeah
the above sounds like a plausible use case too.

Thanks.

--
tejun
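
To make the ordering described above a bit more concrete, here is a rough user-space sketch of the lifecycle. It is a single-threaded toy model and not kernel code: struct model_css and every model_* helper are made-up names for illustration, the counter is a plain int rather than percpu_ref, and the RCU grace period before ->css_free() is only mentioned in a comment, not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; NOT the kernel's struct cgroup_subsys_state. */
struct model_css {
	int  refs;        /* outstanding refs, including the core's base ref */
	bool tryget_dead; /* set once destruction has been initiated */
	int  cached_refs; /* long-term "cache" refs held by the controller */
	bool freed;       /* set when the modeled free step has run */
};

static void model_css_init(struct model_css *css)
{
	css->refs = 1;            /* base reference owned by the core */
	css->tryget_dead = false;
	css->cached_refs = 0;
	css->freed = false;
}

/* Models css_tryget(): fails once destruction has been initiated. */
static bool model_css_tryget(struct model_css *css)
{
	if (css->tryget_dead)
		return false;
	css->refs++;
	return true;
}

/*
 * Models css_put(): the free step runs when the count reaches zero.
 * The real core waits for an RCU grace period before ->css_free();
 * this model has no concurrency, so that wait is omitted.
 */
static void model_css_put(struct model_css *css)
{
	assert(css->refs > 0);
	if (--css->refs == 0)
		css->freed = true;
}

/* Controller stashes a long-term ref, e.g. in an indexing structure. */
static bool model_cache_ref(struct model_css *css)
{
	if (!model_css_tryget(css))
		return false;
	css->cached_refs++;
	return true;
}

/* Models a controller's ->css_offline(): put the long-term refs. */
static void model_css_offline(struct model_css *css)
{
	while (css->cached_refs > 0) {
		css->cached_refs--;
		model_css_put(css);
	}
}

/* Models the core initiating destruction (simplified ordering). */
static void model_css_kill(struct model_css *css)
{
	css->tryget_dead = true;  /* tryget now fails */
	model_css_offline(css);   /* controller drops its cached refs */
	model_css_put(css);       /* core drops the base ref */
}
```

In this model, a css with one cached ref and one other outstanding ref (think of a swap record) stops handing out new refs as soon as model_css_kill() runs and loses its cached ref in the offline step, but it is only marked freed once that last outstanding ref is put.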