On Wed 21-04-21 17:50:06, Muchun Song wrote: > On Wed, Apr 21, 2021 at 3:34 PM Michal Hocko <mhocko@xxxxxxxx> wrote: > > > > On Wed 21-04-21 14:26:44, Muchun Song wrote: > > > The below scenario can cause the page counters of the root_mem_cgroup > > > to be out of balance. > > > > > > CPU0: CPU1: > > > > > > objcg = get_obj_cgroup_from_current() > > > obj_cgroup_charge_pages(objcg) > > > memcg_reparent_objcgs() > > > // reparent to root_mem_cgroup > > > WRITE_ONCE(iter->memcg, parent) > > > // memcg == root_mem_cgroup > > > memcg = get_mem_cgroup_from_objcg(objcg) > > > // do not charge to the root_mem_cgroup > > > try_charge(memcg) > > > > > > obj_cgroup_uncharge_pages(objcg) > > > memcg = get_mem_cgroup_from_objcg(objcg) > > > // uncharge from the root_mem_cgroup > > > page_counter_uncharge(&memcg->memory) > > > > > > This can cause the page counter to be less than the actual value, > > > Although we do not display the value (mem_cgroup_usage) so there > > > shouldn't be any actual problem, but there is a WARN_ON_ONCE in > > > the page_counter_cancel(). Who knows if it will trigger? So it > > > is better to fix it. > > > > The changelog doesn't explain the fix and why you have chosen to charge > > kmem objects to root memcg and left all other try_charge users intact. > > The object cgroup is special (because the page can reparent). Only the > user of objcg APIs should be fixed. > > > The reason is likely that those are not reparented now but that just > > adds an inconsistency. > > > > Is there any reason you haven't simply matched obj_cgroup_uncharge_pages > > to check for the root memcg and bail out early? > > Because obj_cgroup_uncharge_pages() uncharges pages from the > root memcg unconditionally. Why? Because some pages can be > reparented to root memcg, in order to ensure the correctness of > page counter of root memcg. We have to uncharge pages from > root memcg. So we do not check whether the page belongs to > the root memcg when it uncharges. I am not sure I follow. Let me ask differently. Wouldn't you achieve the same if you simply didn't uncharge root memcg in obj_cgroup_charge_pages? Btw. which tree is this patch based on? The current linux-next doesn't uncharge from memcg->memory inside obj_cgroup_uncharge_pages (nor does the Linus tree). -- Michal Hocko SUSE Labs