On Tue, Jan 7, 2020 at 5:31 AM Dave Chinner <david@xxxxxxxxxxxxx> wrote: > > On Mon, Jan 06, 2020 at 10:41:22PM +0800, Yafang Shao wrote: > > On Mon, Jan 6, 2020 at 8:17 AM Dave Chinner <david@xxxxxxxxxxxxx> wrote: > > > We can clean up the code a lot by getting rid of the unnecessary > > > indenting by doing this: > > > > > > /* iterate the global lru first */ > > > isolated = list_lru_walk_one(lru, nid, NULL, isolate, cb_arg, > > > nr_to_walk); > > > if (!list_lru_memcg_aware(lru)) > > > return isolated; > > > > > > nlru = &lru->node[nid]; > > > for_each_mem_cgroup(memcg) { > > > /* already scanned the root memcg above */ > > > if (is_root_memcg(memcg)) > > > continue; > > > if (*nr_to_walk <= 0) > > > break; > > > spin_lock(&nlru->lock); > > > isolated += __list_lru_walk_one(nlru, memcg, > > > isolate, cb_arg, > > > nr_to_walk); > > > spin_unlock(&nlru->lock); > > > } > > > return isolated; > > > > > > > That's eaiser to understand. > > I wil change it like this in next version. > > Thanks! > > > > > > > And so to architecture... This all makes me wonder why we still > > > special case the memcg LRU lists here. > > > > Can't agree more. > > The first time I read this code, I wondered why not assign an > > non-negtive number to kmemcg_id of the root_mem_cgroup and then use > > memcg_lrus as well. > > Yeah, I've been wondering why the ID is -1 instead of 0 when we > have a global variable that stores the root memcg that we can > compare pointers directly against via is_root_memcg(). all it seems > to do is make things more complex by having to special case the root > memcg.... > > From that perspective, I do like your change to use the memcg > iterator functions rather than a for loop over the range of indexes, > but I'd much prefer to see this method used consistently everywhere > rather than the way we've duplicated lots of code by tacking memcgs > onto the side of the non-memcg code paths. > Agreed, that would be better. I will think about it. > > > Ever since we went to > > > per-node memcg lru lists (~2015, iirc), there's been this special > > > hidden hack for the root memcg to use the global list. and one that > > > I have to read lots of code to remind myself it exists every time I > > > have to did into this code again. > > > > > > I mean, if the list is not memcg aware, then it only needs a single > > > list per node - the root memcg list. That could be easily stored in > > > the memcg_lrus array for the node rather than a separate global > > > list, and then the node iteration code all starts to look like this: > > > > > > nlru = &lru->node[nid]; > > > for_each_mem_cgroup(memcg) { > > > spin_lock(&nlru->lock); > > > isolated += __list_lru_walk_one(nlru, memcg, > > > isolate, cb_arg, > > > nr_to_walk); > > > spin_unlock(&nlru->lock); > > > if (*nr_to_walk <= 0) > > > break; > > > > > > /* non-memcg aware lists only have a root memcg list */ > > > if (!list_lru_memcg_aware(lru)) > > > break; > > > } > > > > > > Hence for the walks in the !list_lru_memcg_aware(lru) case, the > > > list_lru_from_memcg() call in __list_lru_walk_one() always returns > > > just the root list. This makes everything use the same iteration > > > interface, and when you configure out memcg then we simply code the > > > the iterator to run once and list_lru_from_memcg() always returns > > > the one list... > > > > > > > Agree with you that it is a better architecture, and I also want to > > change it like this. > > That would be more clear and easier for maintiance. > > But it requires lots of code changes, should we address it in another > > separate patchset ? > > Yes. I think this is a separate piece of work as it spans much more > than just the list-lru infrastructure. > Got it. > > > And, FWIW, I noticed that the list_lru memcg code assumes we only > > > ever put objects from memcg associated slab pages in the list_lru. > > > It calls memcg_from_slab_page() which makes no attempt to verify the > > > page is actually a slab page. That's a landmine just waiting to get > > > boom - list-lru can store any type of object the user wants, not > > > just slab pages. e.g. the binder code stores pages in the list-lru, > > > not slab objects, and so the only reason it doesn't go boom is that > > > the lru-list is not configured to be memcg aware.... > > > > > > > Another new issue. > > I will try to dignose what hiddened in this landmine is, and after I > > understand it clearly I will submit a new patchset. > > The problem is memcg_from_slab_page() uses page->slab_cache directly > to retreive the owner memcg without first checking the > PageSlab(page) flag. If it's not a slab page, we need to get the > memcg from page->memcg, not page->slab_cache->memcg_params.memcg. > > see page_cgroup_ino() for how to safely get the owner memcg from a > random page of unknown type... > Understood, many thanks for your explanation. Thanks Yafang