Hello,

This patchset is a combination of the following two patchsets.

 [1] x86,percpu: generalize 4k and lpage allocator
 [2] percpu: teach lpage allocator about NUMA

Changes from the last postings are

* Updated to be on top of the current percpu#for-next (bf4bb2b1).

* sparc64 was converted to the dynamic percpu allocator and uses
  pcpu_setup_first_chunk(), which is changed by this patchset.
  sparc64 is updated accordingly.

This patchset contains the following patches.

 0001-x86-make-pcpu_chunk_addr_search-matching-stricter.patch
 0002-percpu-drop-unit_size-from-embed-first-chunk-alloc.patch
 0003-x86-percpu-generalize-4k-first-chunk-allocator.patch
 0004-percpu-make-4k-first-chunk-allocator-map-memory.patch
 0005-x86-percpu-generalize-lpage-first-chunk-allocator.patch
 0006-percpu-simplify-pcpu_setup_first_chunk.patch
 0007-percpu-reorder-a-few-functions-in-mm-percpu.c.patch
 0008-percpu-drop-pcpu_chunk-page.patch
 0009-percpu-allow-non-linear-sparse-cpu-unit-mappin.patch
 0010-percpu-teach-large-page-allocator-about-NUMA.patch

0001-0006 generalize the first chunk allocators.  0007-0010 improve the
lpage allocator so that NUMA is handled more intelligently.

This patchset first generalizes the first chunk allocators, then makes
the percpu allocator able to use a non-linear and/or sparse cpu -> unit
mapping, and finally makes the lpage allocator consider CPU topology
and group CPUs within LOCAL_DISTANCE of each other into the same large
pages.  For example, on a 4/4 NUMA machine, the original code used up
16MB for each chunk but the new code uses only 4MB - one large page for
each NUMA node.

The grouping code is quite robust and will try to minimize space
wastage even when the CPU topology is asymmetric.

David, sparc64 should be able to use the lpage (renamed from remap)
allocator the same way x86_64 does.  Well, at least that was my
intention.  If something doesn't work or needs improvement for
sparc64, please let me know.

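To make the 16MB vs 4MB arithmetic above concrete, here is a rough
userspace sketch of the grouping idea.  It is not the code from
mm/percpu.c.  The CPU layout (two nodes with four CPUs each), the 2MB
large page, the 512KB unit size and every function name below are
assumptions made for illustration only, as is the premise that the
ungrouped layout spends one large page per CPU.

```c
#define NR_CPUS		8
#define LOCAL_DISTANCE	10		/* node-local distance, as in the kernel */
#define REMOTE_DISTANCE	20
#define LPAGE_SIZE	(2UL << 20)	/* assumed large page size: 2MB */
#define UNIT_SIZE	(512UL << 10)	/* assumed per-cpu unit size: 512KB */

/* hypothetical cpu -> node map: two nodes, four CPUs each */
static const int cpu_node[NR_CPUS] = { 0, 0, 0, 0, 1, 1, 1, 1 };

/* hypothetical distance function, standing in for real topology info */
static int cpu_distance(int a, int b)
{
	return cpu_node[a] == cpu_node[b] ? LOCAL_DISTANCE : REMOTE_DISTANCE;
}

/*
 * Put each CPU into the group of the first earlier CPU which is
 * LOCAL_DISTANCE away, or start a new group if there is none.
 * Fills group_cnt[] with the number of CPUs per group and returns
 * the number of groups.
 */
static int group_cpus(int group_of[NR_CPUS], int group_cnt[NR_CPUS])
{
	int nr_groups = 0;
	int cpu, prev;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		int g = -1;

		for (prev = 0; prev < cpu; prev++) {
			if (cpu_distance(cpu, prev) == LOCAL_DISTANCE) {
				g = group_of[prev];
				break;
			}
		}
		if (g < 0) {
			g = nr_groups++;
			group_cnt[g] = 0;
		}
		group_of[cpu] = g;
		group_cnt[g]++;
	}
	return nr_groups;
}

/* bytes used when every CPU gets a large page of its own */
static unsigned long per_cpu_lpage_bytes(void)
{
	return NR_CPUS * LPAGE_SIZE;
}

/* bytes used when the units of each group are packed into large pages */
static unsigned long grouped_lpage_bytes(void)
{
	int group_of[NR_CPUS], group_cnt[NR_CPUS];
	int nr_groups = group_cpus(group_of, group_cnt);
	unsigned long nr_lpages = 0;
	int g;

	for (g = 0; g < nr_groups; g++)
		nr_lpages += (group_cnt[g] * UNIT_SIZE + LPAGE_SIZE - 1) /
			     LPAGE_SIZE;
	return nr_lpages * LPAGE_SIZE;
}
```

With these assumed values, per_cpu_lpage_bytes() comes to 16MB and
grouped_lpage_bytes() to 4MB, one large page per node.  The round-up in
grouped_lpage_bytes() is where an asymmetric topology would waste
space, which is what the real grouping code tries to minimize.
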
This patchset is available in the following git tree and will be
published in for-next if there's no major objection.  It might get
rebased before going into for-next.

  git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu.git lpage-numa

diffstat follows.

 arch/sparc/kernel/smp_64.c     |   42 -
 arch/x86/include/asm/percpu.h  |    9
 arch/x86/kernel/setup_percpu.c |  297 ++-------
 arch/x86/mm/pageattr.c         |    1
 include/linux/percpu.h         |   68 +-
 mm/percpu.c                    | 1276 +++++++++++++++++++++++++++++++----------
 6 files changed, 1139 insertions(+), 554 deletions(-)

Thanks.

--
tejun

[1] http://thread.gmane.org/gmane.linux.kernel/853114
[2] http://lkml.org/lkml/2009/6/17/14