> -----Original Message-----
> From: Michal Hocko [mailto:mhocko@xxxxxxxxxx]
> Sent: March 19, 2018 16:54
> To: Li,Rongqing <lirongqing@xxxxxxxxx>
> Cc: linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx;
> cgroups@xxxxxxxxxxxxxxx; hannes@xxxxxxxxxxx; Andrey Ryabinin
> <aryabinin@xxxxxxxxxxxxx>
> Subject: Re: [PATCH] mm/memcontrol.c: speed up to force empty a memory
> cgroup
>
> On Mon 19-03-18 16:29:30, Li RongQing wrote:
> > mem_cgroup_force_empty() tries to free only 32 (SWAP_CLUSTER_MAX)
> > pages on each iteration, if a memory cgroup has lots of page cache, it
> > will take many iterations to empty all page cache, so increase the
> > reclaimed number per iteration to speed it up. same as in
> > mem_cgroup_resize_limit()
> >
> > a simple test show:
> >
> > $dd if=aaa of=bbb bs=1k count=3886080
> > $rm -f bbb
> > $time echo 100000000 >/cgroup/memory/test/memory.limit_in_bytes
> >
> > Before: 0m0.252s ===> after: 0m0.178s
>
> Andrey was proposing something similar [1]. My main objection was that his
> approach might lead to over-reclaim. Your approach is more conservative
> because it just increases the batch size. The size is still rather arbitrary. Same
> as SWAP_CLUSTER_MAX but that one is a commonly used unit of reclaim in
> the MM code.
>
> I would be really curious about more detailed explanation why having a
> larger batch yields to a better performance because we are doing
> SWAP_CLUSTER_MAX batches at the lower reclaim level anyway.
>

Although SWAP_CLUSTER_MAX is used at the lower level, the call stack of
try_to_free_mem_cgroup_pages() is deep. Increasing nr_to_reclaim reduces
the number of times the functions above the batching loop are called
[do_try_to_free_pages, shrink_zones, shrink_node]:

mem_cgroup_resize_limit
 ---> try_to_free_mem_cgroup_pages:  .nr_to_reclaim = max(1024, SWAP_CLUSTER_MAX),
   ---> do_try_to_free_pages
     ---> shrink_zones
       ---> shrink_node
         ---> shrink_node_memcg
           ---> shrink_list      <------- the loop happens here [times = 1024/32]
             ---> shrink_page_list

> [1]
> http://lkml.kernel.org/r/20180119132544.19569-2-aryabinin@xxxxxxxxxxxx
> m
>
> >
> > Signed-off-by: Li RongQing <lirongqing@xxxxxxxxx>
> > ---
> >  mm/memcontrol.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 670e99b68aa6..8910d9e8e908 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -2480,7 +2480,7 @@ static int mem_cgroup_resize_limit(struct mem_cgroup *memcg,
> >  		if (!ret)
> >  			break;
> >
> > -		if (!try_to_free_mem_cgroup_pages(memcg, 1,
> > +		if (!try_to_free_mem_cgroup_pages(memcg, 1024,
> >  					GFP_KERNEL, !memsw)) {
> >  			ret = -EBUSY;
> >  			break;
> > @@ -2610,7 +2610,7 @@ static int mem_cgroup_force_empty(struct mem_cgroup *memcg)
> >  		if (signal_pending(current))
> >  			return -EINTR;
> >
> > -		progress = try_to_free_mem_cgroup_pages(memcg, 1,
> > +		progress = try_to_free_mem_cgroup_pages(memcg, 1024,
> >  							GFP_KERNEL, true);
> >  		if (!progress) {
> >  			nr_retries--;
> > --
> > 2.11.0
>
> --
> Michal Hocko
> SUSE Labs
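
To put rough numbers on the call-chain argument, here is a minimal
userspace sketch in C. It is not kernel code. It assumes an idealized
model in which every call to try_to_free_mem_cgroup_pages() reclaims
exactly nr_to_reclaim pages, which real reclaim does not guarantee. The
helper names (div_round_up, top_level_calls, inner_batches) are
hypothetical and exist only for this illustration.

```c
#define SWAP_CLUSTER_MAX 32UL	/* same value as the kernel constant */

/* Hypothetical helper: integer division rounding up. */
static unsigned long div_round_up(unsigned long n, unsigned long d)
{
	return (n + d - 1) / d;
}

/*
 * Modelled number of trips through the upper part of the call chain
 * (try_to_free_mem_cgroup_pages -> do_try_to_free_pages -> shrink_zones
 * -> shrink_node) needed to reclaim total_pages.
 * ASSUMPTION: each call reclaims exactly max(nr_pages, SWAP_CLUSTER_MAX).
 */
static unsigned long top_level_calls(unsigned long total_pages,
				     unsigned long nr_pages)
{
	unsigned long nr_to_reclaim =
		nr_pages > SWAP_CLUSTER_MAX ? nr_pages : SWAP_CLUSTER_MAX;

	return div_round_up(total_pages, nr_to_reclaim);
}

/*
 * Modelled number of SWAP_CLUSTER_MAX-sized batches at the lower level.
 * This does not depend on nr_pages: the patch does not change the
 * amount of batch work, only how often the upper levels are re-entered.
 */
static unsigned long inner_batches(unsigned long total_pages)
{
	return div_round_up(total_pages, SWAP_CLUSTER_MAX);
}
```

For an assumed 1048576 pages of page cache, this model gives 32768
top-level calls with nr_pages = 1 and 1024 with nr_pages = 1024, while
the number of low-level batches stays at 32768 in both cases. Under
these assumptions the saving comes only from the upper-level function
calls, which would be consistent with the modest speedup measured above.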