On Mon, Sep 14, 2015 at 04:07:32PM -0400, Tejun Heo wrote: > try_charge() is the main charging logic of memcg. When it hits the > limit but either can't fail the allocation due to __GFP_NOFAIL or the > task is likely to free memory very soon, being OOM killed, has SIGKILL > pending or exiting, it "bypasses" the charge to the root memcg and > returns -EINTR. While this is one approach which can be taken for > these situations, it has several issues. > > * It unnecessarily lies about the reality. The number itself doesn't > go over the limit but the actual usage does. memcg is either forced > to or actively chooses to go over the limit because that is the > right behavior under the circumstances, which is completely fine, > but, if at all avoidable, it shouldn't be misrepresenting what's > happening by sneaking the charges into the root memcg. > > * Despite trying, we already do over-charge. kmemcg can't deal with > switching over to the root memcg by the point try_charge() returns > -EINTR, so it open-codes over-charing. > > * It complicates the callers. Each try_charge() user has to handle > the weird -EINTR exception. memcg_charge_kmem() does the manual > over-charging. mem_cgroup_do_precharge() performs unnecessary > uncharging of root memcg, which BTW is inconsistent with what > memcg_charge_kmem() does but not broken as [un]charging are noops on > root memcg. mem_cgroup_try_charge() needs to switch the returned > cgroup to the root one. > > The reality is that in memcg there are cases where we are forced > and/or willing to go over the limit. Each such case needs to be > scrutinized and justified but there definitely are situations where > that is the right thing to do. We alredy do this but with a > superficial and inconsistent disguise which leads to unnecessary > complications. > > This patch updates try_charge() so that it over-charges and returns 0 > when deemed necessary. -EINTR return is removed along with all > special case handling in the callers. > > While at it, remove the local variable @ret, which was initialized to > zero and never changed, along with done: label which just returned the > always zero @ret. > > v2: Minor update to patch description as per Vladimir. > > Signed-off-by: Tejun Heo <tj@xxxxxxxxxx> > Reviewed-by: Vladimir Davydov <vdavydov@xxxxxxxxxxxxx> > Acked-by: Michal Hocko <mhocko@xxxxxxxx> Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx> -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>