On Wed, Jun 12, 2013 at 03:22:31PM -0700, Andrew Morton wrote:
> On Tue, 26 Mar 2013 13:38:43 +0800 Shaohua Li <shli@xxxxxxxxxx> wrote:
>
> > swap cluster allocation is to get better request merge to improve performance.
> > But the cluster is shared globally, if multiple tasks are doing swap, this will
> > cause interleave disk access. While multiple tasks swap is quite common, for
> > example, each numa node has a kswapd thread doing swap or multiple
> > threads/processes do direct page reclaim.
> >
> > We makes the cluster allocation per-cpu here. The interleave disk access issue
> > goes away. All tasks will do sequential swap.
>
> Why per-cpu rather than, say, per-mm or per-task?

Good question. Per-cpu is easier to implement :). Per-mm or per-task is
likely to be more readahead-friendly for sequential workloads, while per-cpu
is better on the swapout side for random workloads. I'm not sure which is
more important, so I chose to solve the swapout issue. This should definitely
be revisited if we want better swapin readahead. But if you dislike this
patch, I'm OK with dropping it in the next post.

> > If one CPU can't get its per-cpu cluster, it will fallback to scan swap_map.
>
> Under what circumstances can a cpu "not get its per-cpu cluster"? A
> cpu can always "get" its per-cpu data, by definition (unless perhaps
> interrupts are involved). Perhaps this description needs some
> expanding upon.

The circumstance is that there is no free cluster left. I'll rewrite the
description.

Thanks,
Shaohua
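
For readers following the thread, here is a rough user-space sketch of the scheme being discussed: each CPU allocates sequentially from its own cluster, and falls back to scanning the swap map only when no free cluster is left. This is a toy model, not the patch or any kernel code; `ToySwap`, `CLUSTER_SIZE`, and the method names are all made up for illustration, and locking, slot freeing, and real cluster sizes are ignored.

```python
CLUSTER_SIZE = 4  # slots per cluster; toy value chosen for illustration


class ToySwap:
    """Hypothetical model of per-CPU swap cluster allocation."""

    def __init__(self, nr_clusters, nr_cpus):
        # swap_map[i] == 0 means slot i is free, 1 means it is in use.
        self.swap_map = [0] * (nr_clusters * CLUSTER_SIZE)
        # Clusters that no CPU has claimed yet.
        self.free_clusters = list(range(nr_clusters))
        # Per CPU: None, or [cluster index, next offset to try].
        self.percpu = [None] * nr_cpus

    def _scan(self):
        # Fallback path: linear scan of the whole map for any free slot.
        for slot, used in enumerate(self.swap_map):
            if not used:
                self.swap_map[slot] = 1
                return slot
        return None  # swap space is full

    def alloc(self, cpu):
        while True:
            cur = self.percpu[cpu]
            if cur is None:
                if not self.free_clusters:
                    # "Can't get its per-cpu cluster": no free cluster left.
                    return self._scan()
                cur = self.percpu[cpu] = [self.free_clusters.pop(0), 0]
            cluster, offset = cur
            while offset < CLUSTER_SIZE:
                slot = cluster * CLUSTER_SIZE + offset
                offset += 1
                # Skip slots a fallback scan from another CPU already took.
                if self.swap_map[slot] == 0:
                    cur[1] = offset
                    self.swap_map[slot] = 1
                    return slot
            # Current cluster is used up; drop it and try to claim another.
            self.percpu[cpu] = None


swap = ToySwap(nr_clusters=3, nr_cpus=2)
# Two CPUs allocate in alternation, as two concurrent reclaimers might.
order = [(cpu, swap.alloc(cpu)) for _ in range(4) for cpu in (0, 1)]
print(order)
# [(0, 0), (1, 4), (0, 1), (1, 5), (0, 2), (1, 6), (0, 3), (1, 7)]
```

Even though the two CPUs allocate in alternation, each one gets a contiguous run of slots (0-3 and 4-7), which is the sequential behaviour the patch description is after. With a single shared cursor, the same call pattern would hand out interleaved slots to the two CPUs.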