On Mon, 26 Jul 2010 13:08:27 +0800 Shaohua Li <shaohua.li@xxxxxxxxx> wrote:

> The zone->lru_lock is heavily contended in workloads where activate_page()
> is frequently used. We could batch activate_page() to reduce the lock
> contention. The batched pages will be added to the zone list when the pool
> is full or when page reclaim tries to drain them.
>
> For example, on a 4 socket, 64 CPU system, create a sparse file and 64
> processes which share a mapping of the file. Each process reads the whole
> file and then exits. The process exit will do unmap_vmas() and cause a lot
> of activate_page() calls. In such a workload, we saw about a 58% total time
> reduction with the patch below.

What happened to the 2% regression that earlier changelogs mentioned?

AFAICT the patch optimises the rare munmap() case. But what effect does it
have upon the common case? How do we know that it is a net benefit?

Because the impact on kernel footprint is awful. x86_64 allmodconfig:

   text    data     bss     dec     hex filename
   5857    1426    1712    8995    2323 mm/swap.o
   6245    1587    1840    9672    25c8 mm/swap.o

and look at x86_64 allnoconfig:

   text    data     bss     dec     hex filename
   2344     768       4    3116     c2c mm/swap.o
   2632     896       4    3532     dcc mm/swap.o

That's a uniprocessor kernel, where none of this is of any use!

Looking at the patch, I'm not sure where all the bloat came from. But the
SMP=n case is pretty bad and needs fixing, IMO.