In our tests of mlock, we have found a severe performance regression.
Further investigation shows that mlock is blocked heavily by
lru_add_drain_all, which calls schedule_on_each_cpu and flushes the
work queue; this gets very slow when there are many cpus. So we have
tried two ways to solve it:

1. Add a per-cpu counter for all the pagevecs, so that we don't
   schedule and flush the lru_drain work if a cpu doesn't have any
   pagevecs (I have finished this code already).
2. Remove the lru_add_drain_all call.

The first one is problematic because on our production systems all
the cpus are busy, so there is very little chance for a cpu to have
zero pagevecs, except when several mlocks run back to back. From the
commit log that added this call (8891d6da), it seems we don't have to
make it, so the second option looks both easy and workable, and hence
this patch.

Thanks
Tao

From 8cdf7f7ed236367e85151db65ae06f781aca7d77 Mon Sep 17 00:00:00 2001
From: Tao Ma <boyu.mt@xxxxxxxxxx>
Date: Fri, 30 Dec 2011 14:20:08 +0800
Subject: [PATCH] mm: do not drain pagevecs for mlock

In commit 8891d6da, lru_add_drain_all was added to mlock to flush all
the per-cpu pagevecs. This makes the system call run much slower than
its predecessor (around 20 times slower on a 16-core Xeon E5620), and
the more cores we have, the larger the performance penalty, because
of the costly call to schedule_on_each_cpu.

From the commit log of 8891d6da we can see that "it isn't must. but
it reduce the failure of moving to unevictable list. its failure can
rescue in vmscan later." Christoph Lameter has already removed the
call from mlockall(MCL_FUTURE), so this patch removes the remaining
calls from mlock and mlockall.
Without this patch:
time ./test_mlock -c 100000
real	0m20.566s
user	0m0.074s
sys	0m12.759s

With this patch:
time ./test_mlock -c 100000
real	0m1.675s
user	0m0.049s
sys	0m1.622s

Cc: David Rientjes <rientjes@xxxxxxxxxx>
Cc: Minchan Kim <minchan.kim@xxxxxxxxx>
Cc: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx>
Cc: Mel Gorman <mel@xxxxxxxxx>
Cc: Johannes Weiner <jweiner@xxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Tao Ma <boyu.mt@xxxxxxxxxx>
---
 mm/mlock.c |    5 -----
 1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/mm/mlock.c b/mm/mlock.c
index 4f4f53b..bb5fc42 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -487,8 +487,6 @@ SYSCALL_DEFINE2(mlock, unsigned long, start, size_t, len)
 	if (!can_do_mlock())
 		return -EPERM;
 
-	lru_add_drain_all();	/* flush pagevec */
-
 	down_write(&current->mm->mmap_sem);
 	len = PAGE_ALIGN(len + (start & ~PAGE_MASK));
 	start &= PAGE_MASK;
@@ -557,9 +555,6 @@ SYSCALL_DEFINE1(mlockall, int, flags)
 	if (!can_do_mlock())
 		goto out;
 
-	if (flags & MCL_CURRENT)
-		lru_add_drain_all();	/* flush pagevec */
-
 	down_write(&current->mm->mmap_sem);
 
 	lock_limit = rlimit(RLIMIT_MEMLOCK);
-- 
1.7.4.1