2020년 3월 19일 (목) 오전 4:59, Johannes Weiner <hannes@xxxxxxxxxxx>님이 작성: > > On Tue, Mar 17, 2020 at 02:41:54PM +0900, js1304@xxxxxxxxx wrote: > > From: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx> > > > > When implementing workingset detection for anonymous page, I found > > some swapcache pages with NULL memcg. From the code reading, I found > > two reasons. > > > > One is the case that swap-in readahead happens. The other is the > > corner case related to the shmem cache. These two problems should be > > fixed, but, it's not straight-forward to fix. For example, when swap-off, > > all swapped-out pages are read into swapcache. In this case, who's the > > owner of the swapcache page? > > > > Since this problem doesn't look trivial, I decide to leave the issue and > > handles this corner case on the place where the error occurs. > > > > Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx> > > It wouldn't be hard to find out who owns this page. The code in > mem_cgroup_try_charge() is only a few lines: lookup_swap_cgroup_id() uses additional memory and is only usable if CONFIG_MEMCG_SWAP is on. > swp_entry_t ent = { .val = page_private(page), }; > unsigned short id = lookup_swap_cgroup_id(ent); > > rcu_read_lock(); > memcg = mem_cgroup_from_id(id); > if (memcg && !css_tryget_online(&memcg->css)) > memcg = NULL; > rcu_read_unlock(); > > THAT BEING SAID, I don't think we actually *want* to know the original > cgroup for readahead pages. Because before they are accessed and > charged to the original owner, the pages are sitting on the root > cgroup LRU list and follow the root group's aging speed and LRU order. Okay. Sound reasonable. > Eviction and refault tracking is about the LRU that hosts the pages. > > So IMO your patch is much less of a hack than you might think. Good! > > diff --git a/mm/workingset.c b/mm/workingset.c > > index a9f474a..8d2e83a 100644 > > --- a/mm/workingset.c > > +++ b/mm/workingset.c > > @@ -257,6 +257,10 @@ void *workingset_eviction(struct page *page, struct mem_cgroup *target_memcg) > > VM_BUG_ON_PAGE(page_count(page), page); > > VM_BUG_ON_PAGE(!PageLocked(page), page); > > > > + /* page_memcg() can be NULL if swap-in readahead happens */ > > + if (!page_memcg(page)) > > + return NULL; > > + > > advance_inactive_age(page_memcg(page), pgdat, is_file); > > > > lruvec = mem_cgroup_lruvec(target_memcg, pgdat); > > This means a readahead page that hasn't been accessed will actively > not be tracked as an eviction and later as a refault. > > I think that's the right thing to do, but I would expand the comment: Okay. I will add the following comment. Thanks. > /* > * A page can be without a cgroup here when it was brought in by swap > * readahead and nobody has touched it since. > * > * The idea behind the workingset code is to tell on page fault time > * whether pages have been previously used or not. Since this page > * hasn't been used, don't store a shadow entry for it; when it later > * faults back in, we treat it as the new page that it is. > */