On Wed, Apr 21, 2021 at 9:03 PM Michal Hocko <mhocko@xxxxxxxx> wrote: > > On Wed 21-04-21 17:50:06, Muchun Song wrote: > > On Wed, Apr 21, 2021 at 3:34 PM Michal Hocko <mhocko@xxxxxxxx> wrote: > > > > > > On Wed 21-04-21 14:26:44, Muchun Song wrote: > > > > The below scenario can cause the page counters of the root_mem_cgroup > > > > to be out of balance. > > > > > > > > CPU0: CPU1: > > > > > > > > objcg = get_obj_cgroup_from_current() > > > > obj_cgroup_charge_pages(objcg) > > > > memcg_reparent_objcgs() > > > > // reparent to root_mem_cgroup > > > > WRITE_ONCE(iter->memcg, parent) > > > > // memcg == root_mem_cgroup > > > > memcg = get_mem_cgroup_from_objcg(objcg) > > > > // do not charge to the root_mem_cgroup > > > > try_charge(memcg) > > > > > > > > obj_cgroup_uncharge_pages(objcg) > > > > memcg = get_mem_cgroup_from_objcg(objcg) > > > > // uncharge from the root_mem_cgroup > > > > page_counter_uncharge(&memcg->memory) > > > > > > > > This can cause the page counter to be less than the actual value, > > > > Although we do not display the value (mem_cgroup_usage) so there > > > > shouldn't be any actual problem, but there is a WARN_ON_ONCE in > > > > the page_counter_cancel(). Who knows if it will trigger? So it > > > > is better to fix it. > > > > > > The changelog doesn't explain the fix and why you have chosen to charge > > > kmem objects to root memcg and left all other try_charge users intact. > > > > The object cgroup is special (because the page can reparent). Only the > > user of objcg APIs should be fixed. > > > > > The reason is likely that those are not reparented now but that just > > > adds an inconsistency. > > > > > > Is there any reason you haven't simply matched obj_cgroup_uncharge_pages > > > to check for the root memcg and bail out early? > > > > Because obj_cgroup_uncharge_pages() uncharges pages from the > > root memcg unconditionally. Why? Because some pages can be > > reparented to root memcg, in order to ensure the correctness of > > page counter of root memcg. We have to uncharge pages from > > root memcg. So we do not check whether the page belongs to > > the root memcg when it uncharges. > > I am not sure I follow. Let me ask differently. Wouldn't you > achieve the same if you simply didn't uncharge root memcg in > obj_cgroup_charge_pages? I'm afraid not. Some pages should uncharge root memcg, some pages should not uncharge root memcg. But all those pages belong to the root memcg. We cannot distinguish between the two. I believe Roman is very familiar with this mechanism (objcg APIs). Hi Roman, Any thoughts on this? > > Btw. which tree is this patch based on? The current linux-next doesn't > uncharge from memcg->memory inside obj_cgroup_uncharge_pages (nor does > the Linus tree). Sorry. I should expose more details. obj_cgroup_uncharge_pages refill_stock->drain_stock page_counter_uncharge // uncharging is here Thanks. > -- > Michal Hocko > SUSE Labs