On Tue, Apr 7, 2020 at 9:22 AM, Yang Shi <shy828301@xxxxxxxxx> wrote:
>
> On Sun, Apr 5, 2020 at 6:03 PM Joonsoo Kim <js1304@xxxxxxxxx> wrote:
> >
> > On Sat, Apr 4, 2020 at 3:29 AM, Yang Shi <shy828301@xxxxxxxxx> wrote:
> > >
> > > On Thu, Apr 2, 2020 at 10:41 PM <js1304@xxxxxxxxx> wrote:
> > > >
> > > > From: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
> > > >
> > > > Currently, some swapped-in pages are not charged to the memcg until
> > > > actual access to the page happens. I checked the code and found that
> > > > it could cause a problem. In this implementation, even if the memcg
> > > > is enabled, one can consume a lot of memory in the system by exploiting
> > > > this hole. For example, one can make all the pages swapped out and
> > > > then call madvise_willneed() to load all the swapped-out pages without
> > > > pressing the memcg. Although actual access requires charging, it's a really
> > > > big benefit to load the swapped-out pages to the memory without pressing
> > > > the memcg.
> > > >
> > > > And, for workingset detection which is implemented in the following patch,
> > > > a memcg should be committed before the workingset detection is executed.
> > > > For this purpose, the best solution, I think, is charging the page when
> > > > adding to the swap cache. Charging there is not that hard. Caller of
> > > > adding the page to the swap cache has enough information about the charged
> > > > memcg. So, what we need to do is just passing this information to
> > > > the right place.
> > > >
> > > > With this patch, specific memcg could be pressured more since readahead
> > > > pages are also charged to it now. This would result in performance
> > > > degradation to that user but it would be fair since that readahead is for
> > > > that user.
> > >
> > > If I read the code correctly, the readahead pages may be *not* charged
> > > to it at all but other memcgs since mem_cgroup_try_charge() would
> > > retrieve the target memcg id from the swap entry then charge to it
> > > (generally it is the memcg from which the page is swapped out). So, it
> > > may open a backdoor to let one memcg stress other memcgs?
> >
> > It looks like you talk about the call path on CONFIG_MEMCG_SWAP.
> >
> > The owner (task) for an anonymous page cannot be changed. It means that
> > the previous owner written on the swap entry will be the next user. So,
> > I think that using the target memcg id from the swap entry for readahead pages
> > is a valid way.
> >
> > As you are concerned, if someone can control swap-readahead to readahead
> > other's swap entry, one memcg could stress other memcg by using the fact above.
> > However, as far as I know, there is no explicit way to readahead other's swap
> > entry so no problem.
>
> Swap cluster readahead would readahead in pages on consecutive swap
> entries which may belong to different memcgs, however I just figured
> out patch #8 ("mm/swap: do not readahead if the previous owner of the
> swap entry isn't me") would prevent from reading ahead pages belonging
> to other memcgs. This would kill the potential problem.

Yes, that patch kills the potential problem. However, I think that swap
cluster readahead would not open the backdoor even without patch #8 in
the CONFIG_MEMCG_SWAP case, because:

1. Consecutive swap space is usually filled by the same task.
2. Swap cluster readahead imposes a large I/O cost on the offender, and
   the effect on the target isn't serious.
3. Those pages would be charged to their previous owner, which is valid.

Thanks.
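
To make the charging behaviour discussed above concrete, here is a toy
userspace model, not kernel code. Every name in it (toy_swap_entry,
toy_charge_swapin, toy_cluster_readahead, the skip_foreign flag) is
hypothetical and exists only for illustration. It assumes each swap slot
records the memcg id of its previous owner, that a swapped-in page is
charged to that recorded memcg, and that the patch #8 behaviour can be
approximated by skipping slots whose recorded owner differs from the
faulting memcg.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_MEMCG 4

/* Hypothetical toy model of a swap slot; not a kernel structure. */
struct toy_swap_entry {
    int owner_memcg;     /* memcg id recorded at swap-out time */
    bool in_swap_cache;  /* already read in and charged */
};

/* Pages charged per memcg in this model. */
static long toy_charged[NR_MEMCG];

/* Charge the page to the memcg recorded in the swap entry (point 3). */
static void toy_charge_swapin(struct toy_swap_entry *e)
{
    toy_charged[e->owner_memcg]++;
    e->in_swap_cache = true;
}

/*
 * Read ahead the aligned window of consecutive slots around fault_idx.
 * With skip_foreign set, slots owned by another memcg are left alone,
 * which is the assumed effect of patch #8. Returns pages brought in.
 */
static size_t toy_cluster_readahead(struct toy_swap_entry *entries, size_t n,
                                    size_t fault_idx, size_t window,
                                    int faulting_memcg, bool skip_foreign)
{
    assert(window > 0 && fault_idx < n);

    size_t start = fault_idx - fault_idx % window;
    size_t end = start + window < n ? start + window : n;
    size_t loaded = 0;

    for (size_t i = start; i < end; i++) {
        struct toy_swap_entry *e = &entries[i];

        if (e->in_swap_cache)
            continue;
        if (skip_foreign && e->owner_memcg != faulting_memcg)
            continue;
        toy_charge_swapin(e);
        loaded++;
    }
    return loaded;
}
```

In this model, without skip_foreign a fault from memcg 0 also brings in
and charges the slots owned by memcg 1, but the charge lands on memcg 1,
the previous owner, and never on an unrelated memcg. With skip_foreign
set, only memcg 0's own slots are read in.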