On Mon, Apr 20, 2020 at 06:11:23PM -0400, Johannes Weiner wrote:
> Without swap page tracking, users that are otherwise memory controlled
> can easily escape their containment and allocate significant amounts
> of memory that they're not being charged for. That's because swap does
> readahead, but without the cgroup records of who owned the page at
> swapout, readahead pages don't get charged until somebody actually
> faults them into their page table and we can identify an owner task.
> This can be maliciously exploited with MADV_WILLNEED, which triggers
> arbitrary readahead allocations without charging the pages.
>
> Make swap page tracking an integral part of memcg and remove the
> Kconfig options. In the first place, it was only made configurable to
> allow users to save some memory. But the overhead of tracking cgroup
> ownership per swap page is minimal - 2 bytes per page, or 512k per 1G
> of swap, or 0.04%. Saving that at the expense of broken containment
> semantics is not something we should present as a coequal option.
>
> The swapaccount=0 boot option will continue to exist, and it will
> eliminate the page_counter overhead and hide the swap control files,
> but it won't disable swap slot ownership tracking.
>
> This patch makes sure we always have the cgroup records at swapin
> time; the next patch will fix the actual bug by charging readahead
> swap pages at swapin time rather than at fault time.
>
> Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>

Reviewed-by: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
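
For reference, the overhead figure in the changelog can be checked with a
few lines of arithmetic. This is only a sketch: the 4 KiB page size is an
assumption (the changelog does not state it), and the function and constant
names are made up for illustration, not taken from the kernel.

```python
# Hypothetical helper, not kernel code: reproduces the changelog arithmetic.
PAGE_SIZE = 4096      # assumption: 4 KiB pages
RECORD_BYTES = 2      # per-swap-page ownership record, per the changelog


def swap_tracking_overhead(swap_bytes, page_size=PAGE_SIZE,
                           record_bytes=RECORD_BYTES):
    """Return (overhead in bytes, overhead as a fraction of swap size)."""
    pages = swap_bytes // page_size
    overhead = pages * record_bytes
    return overhead, overhead / swap_bytes


overhead, ratio = swap_tracking_overhead(1 << 30)   # 1G of swap
print(f"{overhead // 1024}k per 1G of swap, {ratio * 100:.3f}%")
```

With these assumptions the result is 512k per 1G of swap, a ratio of
2/4096, which is just under 0.05%; the 0.04% quoted above looks like the
same number truncated rather than rounded.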