On Thu, Jan 6, 2022 at 7:00 PM Michal Koutný <mkoutny@xxxxxxxx> wrote:
>
> On Mon, Dec 20, 2021 at 04:56:43PM +0800, Muchun Song <songmuchun@xxxxxxxxxxxxx> wrote:
> (Thanks for pointing me here.)
>
> > -void memcg_drain_all_list_lrus(int src_idx, struct mem_cgroup *dst_memcg)
> > +void memcg_drain_all_list_lrus(struct mem_cgroup *src, struct mem_cgroup *dst)
> >  {
> > +	struct cgroup_subsys_state *css;
> >  	struct list_lru *lru;
> > +	int src_idx = src->kmemcg_id;
> > +
> > +	/*
> > +	 * Change kmemcg_id of this cgroup and all its descendants to the
> > +	 * parent's id, and then move all entries from this cgroup's list_lrus
> > +	 * to ones of the parent.
> > +	 *
> > +	 * After we have finished, all list_lrus corresponding to this cgroup
> > +	 * are guaranteed to remain empty. So we can safely free this cgroup's
> > +	 * list lrus in memcg_list_lru_free().
> > +	 *
> > +	 * Changing ->kmemcg_id to the parent can prevent memcg_list_lru_alloc()
> > +	 * from allocating list lrus for this cgroup after the
> > +	 * memcg_list_lru_free() call.
> > +	 */
> > +	rcu_read_lock();
> > +	css_for_each_descendant_pre(css, &src->css) {
> > +		struct mem_cgroup *memcg;
> > +
> > +		memcg = mem_cgroup_from_css(css);
> > +		memcg->kmemcg_id = dst->kmemcg_id;
> > +	}
> > +	rcu_read_unlock();
>
> Do you envision using this function anywhere else besides offlining?
> If not, you shouldn't need to traverse the whole subtree because normally
> parents are offlined only after children (see cgroup_subsys_state.online_cnt).
>
> >
> >  	mutex_lock(&list_lrus_mutex);
> >  	list_for_each_entry(lru, &memcg_list_lrus, list)
> > -		memcg_drain_list_lru(lru, src_idx, dst_memcg);
> > +		memcg_drain_list_lru(lru, src_idx, dst);
> >  	mutex_unlock(&list_lrus_mutex);
>
> If you do, then here you only drain the list_lru of the subtree root but not
> those of the descendants anymore.
>
> So I do get that mem_cgroup.kmemcg_id references the "effective"
> kmemcg_id after offlining, so that proper list_lrus are used afterwards.
>
> I wonder -- is this necessary when objcgs are reparented too? IOW would
> anyone query the offlined child's kmemcg_id?
> (Maybe it's worth explaining better in the commit message, I think even
> the current approach is OK (better safe than sorry).)
>

Sorry for the late reply.

Imagine a group of memcgs as follows. C's parent is B, B's parent is A
and A's parent is root. The numbers in parentheses are their respective
kmemcg_ids.

  root(-1) -> A(0) -> B(1) -> C(2)

  CPU0:                          CPU1:
  memcg_list_lru_alloc(C)
                                 memcg_drain_all_list_lrus(C)
                                 memcg_drain_all_list_lrus(B)
                                 // Now C and B are offline. The
                                 // kmemcg_ids become the following if
                                 // we do not change the kmemcg_id of
                                 // the descendants in
                                 // memcg_drain_all_list_lrus().
                                 //
                                 // root(-1) -> A(0) -> B(0) -> C(1)

  for (i = 0; memcg; memcg = parent_mem_cgroup(memcg), i++) {
          // allocate struct list_lru_per_memcg for memcg C
          table[i].mlru = memcg_init_list_lru_one(gfp);
  }

  spin_lock_irqsave(&lru->lock, flags);
  while (i--) {
          // here index = 1
          int index = table[i].memcg->kmemcg_id;
          struct list_lru_per_memcg *mlru = table[i].mlru;

          if (index < 0 || rcu_dereference_protected(mlrus->mlru[index], true))
                  kfree(mlru);
          else
                  // mlrus->mlru[index] will be assigned a new value even
                  // though memcg C is already offline.
                  rcu_assign_pointer(mlrus->mlru[index], mlru);
  }
  spin_unlock_irqrestore(&lru->lock, flags);

So changing ->kmemcg_id of all the descendants can prevent
memcg_list_lru_alloc() from allocating list lrus for the offlined cgroup
after the memcg_list_lru_free() call.

Thanks.