Changes since v3 [1]:
* Replace runtime 'shuffle_page_order' parameter with a compile-time
  CONFIG_SHUFFLE_PAGE_ALLOCATOR on/off switch and a
  CONFIG_SHUFFLE_PAGE_ORDER option in case a distro decides that the
  default 4MB shuffling boundary is not sufficient. Administrators will
  not be burdened with making this decision. (Michal)

* Move shuffle related code into a new mm/shuffle.c file.

[1]: https://www.mail-archive.com/linux-kernel@xxxxxxxxxxxxxxx/msg1783262.html

---

Some data exfiltration and return-oriented-programming attacks rely on
the ability to infer the location of sensitive data objects. The kernel
page allocator, especially early in system boot, has predictable
first-in-first-out behavior for physical pages. Pages are freed in
physical address order when first onlined.

Quoting Kees:
    "While we already have a base-address randomization
     (CONFIG_RANDOMIZE_MEMORY), attacks against the same hardware and
     memory layouts would certainly be using the predictability of
     allocation ordering (i.e. for attacks where the base address isn't
     important: only the relative positions between allocated memory).
     This is common in lots of heap-style attacks. They try to gain
     control over ordering by spraying allocations, etc.

     I'd really like to see this because it gives us something similar
     to CONFIG_SLAB_FREELIST_RANDOM but for the page allocator."

Another motivation for this change is performance in the presence of a
memory-side cache. In the future, memory-side-cache technology will be
available on generally available server platforms. The proposed
randomization approach has been measured to improve the cache conflict
rate by a factor of 2.5X on a well-known Java benchmark. It avoids
performance peaks and valleys to provide more predictable performance.

The initial randomization in patch1 can be undone over time so patch3 is
introduced to inject entropy on page free decisions.
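
For readers who want to experiment with the two mechanisms named above,
here is a minimal userspace sketch, not the code in this series. It
assumes a Fisher-Yates shuffle as a stand-in for the patch1 shuffle and
a coin flip between head and tail insertion as a stand-in for the patch3
free-time entropy. NBLOCKS, struct free_list, and every function name
below are hypothetical, and rand() is only a placeholder entropy source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NBLOCKS 16	/* hypothetical count of free blocks in one zone */

/* Assumed stand-in for patch1: Fisher-Yates shuffle of the block order. */
static void shuffle_blocks(unsigned long *blk, size_t n)
{
	if (n < 2)
		return;
	for (size_t i = n - 1; i > 0; i--) {
		/* rand() % (i + 1) has slight modulo bias; fine for a sketch */
		size_t j = (size_t)rand() % (i + 1);
		unsigned long tmp = blk[i];

		blk[i] = blk[j];
		blk[j] = tmp;
	}
}

/* Toy free list: an array-backed deque that grows in both directions. */
struct free_list {
	unsigned long slot[2 * NBLOCKS];
	size_t head;	/* index of the first entry */
	size_t tail;	/* one past the last entry */
};

static void free_list_init(struct free_list *fl)
{
	fl->head = fl->tail = NBLOCKS;	/* start empty, in the middle */
}

/* Add a freed block at the tail (to_tail != 0) or at the head. */
static void free_block(struct free_list *fl, unsigned long blk, int to_tail)
{
	if (to_tail) {
		assert(fl->tail < 2 * NBLOCKS);
		fl->slot[fl->tail++] = blk;
	} else {
		assert(fl->head > 0);
		fl->slot[--fl->head] = blk;
	}
}

/* Assumed stand-in for patch3: pick head or tail with one random bit. */
static void free_block_random(struct free_list *fl, unsigned long blk)
{
	free_block(fl, blk, rand() & 1);
}

/* Count neighbouring list entries whose block numbers differ by one. */
static size_t count_adjacent(const struct free_list *fl)
{
	size_t n = 0;

	for (size_t i = fl->head; i + 1 < fl->tail; i++) {
		unsigned long a = fl->slot[i], b = fl->slot[i + 1];

		if (a + 1 == b || b + 1 == a)
			n++;
	}
	return n;
}
```

Calling free_block_random() on blocks 0..NBLOCKS-1 in ascending order,
with and without a prior shuffle_blocks() pass over that sequence, and
comparing count_adjacent() for the two resulting lists gives a rough
feel for how much physical locality each scheme leaves behind.
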
It is reasonable to ask whether the page free entropy alone is
sufficient. It is not, because of the in-order initial freeing of pages.
At the start of that process putting page1 in front or behind page0
still keeps them close together, and page2 is still near page1 and has a
high chance of being adjacent. As more pages are added ordering
diversity improves, but there is still high page locality for the low
address pages, and this leads to no significant impact on the cache
conflict rate. More details are in the patch1 commit message.

---

Dan Williams (3):
      mm: Shuffle initial free memory
      mm: Move buddy list manipulations into helpers
      mm: Maintain randomization of page free lists

 include/linux/list.h     |   17 ++++
 include/linux/mm.h       |   30 +++++++
 include/linux/mm_types.h |    3 +
 include/linux/mmzone.h   |   65 ++++++++++++++++
 init/Kconfig             |   32 ++++++++
 mm/Makefile              |    1
 mm/compaction.c          |    4 -
 mm/memblock.c            |    9 ++
 mm/memory_hotplug.c      |    2
 mm/page_alloc.c          |   81 +++++++++-----------
 mm/shuffle.c             |  186 ++++++++++++++++++++++++++++++++++++++++++++++
 11 files changed, 381 insertions(+), 49 deletions(-)
 create mode 100644 mm/shuffle.c