Huge pages in the current kernel can bring a clear performance
improvement for some workloads by reducing TLB misses and page faults.
But the limited choice of huge page sizes (2M/1G on x86_64) also brings
extra cost, such as higher memory consumption and more CPU cycles spent
on page zeroing.

The idea of the multiple consecutive page (abbreviated "mcpage") is to
use a collection of physically contiguous 4K pages, rather than a huge
page, for anonymous mappings.

The goal is to offer more choices when trading off the pros and cons of
huge pages. Compared to huge pages, mcpage gets less benefit from
reduced TLB misses and page faults, but it also pays less extra cost in
memory consumption and in the latency introduced by page compaction,
page zeroing, etc.

The size of an mcpage is configurable. The default of 16K was picked
arbitrarily. Users should choose the value based on the results of
tuning their workload with different mcpage sizes.

To get physically contiguous pages, a high-order page is allocated (the
order is calculated from the mcpage size) and then split. After the
split, each sub-page of the mcpage is just a normal 4K page, so the
current kernel page management infrastructure applies to mcpages
without any change.

To reduce the number of page faults, multiple page table entries are
populated in one page fault with the pfns of the mcpage's sub-pages.
This also costs a little extra memory.

Update Kconfig to let the user define the mcpage order, and define
macros for the mcpage mask/shift/nr/size.

In this RFC patch, only Kconfig is used to set the mcpage order, to
show the idea. A runtime parameter will be used if this becomes an
official patch in the future.
Signed-off-by: Yin Fengwei <fengwei.yin@xxxxxxxxx>
---
 include/linux/mm_types.h | 11 +++++++++++
 mm/Kconfig               | 19 +++++++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 3b8475007734..fa561c7b6290 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -71,6 +71,17 @@ struct mem_cgroup;
 #define _struct_page_alignment	__aligned(sizeof(unsigned long))
 #endif
 
+#ifdef CONFIG_MCPAGE_ORDER
+#define MCPAGE_ORDER		CONFIG_MCPAGE_ORDER
+#else
+#define MCPAGE_ORDER		0
+#endif
+
+#define MCPAGE_SIZE		(1 << (MCPAGE_ORDER + PAGE_SHIFT))
+#define MCPAGE_MASK		(~(MCPAGE_SIZE - 1))
+#define MCPAGE_SHIFT		(MCPAGE_ORDER + PAGE_SHIFT)
+#define MCPAGE_NR		(1 << (MCPAGE_ORDER))
+
 struct page {
 	unsigned long flags;		/* Atomic flags, some possibly
 					 * updated asynchronously */
diff --git a/mm/Kconfig b/mm/Kconfig
index ff7b209dec05..c202dc99ab6d 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -650,6 +650,25 @@ config HUGETLB_PAGE_SIZE_VARIABLE
 	  Note that the pageblock_order cannot exceed MAX_ORDER - 1 and will be
 	  clamped down to MAX_ORDER - 1.
 
+config MCPAGE
+	bool "multiple consecutive page <mcpage>"
+	default n
+	help
+	  Enable multiple consecutive page: mcpage is page collections (sub-page)
+	  which are physical contiguous. When mapping to user space, all the
+	  sub-pages will be mapped to user space in one page fault handler.
+	  Expect to trade off the pros and cons of huge page. Like less
+	  unnecessary extra memory zeroing and less memory consumption.
+	  But with no TLB benefit.
+
+config MCPAGE_ORDER
+	int "multiple consecutive page order"
+	default 2
+	depends on X86_64 && MCPAGE
+	help
+	  The order of mcpage. Should be chosen carefully by tuning your
+	  workload.
+
 config CONTIG_ALLOC
 	def_bool (MEMORY_ISOLATION && COMPACTION) || CMA
 
-- 
2.30.2