Changes v2: - Split the series into one adding KASAN tag-based mode (this one) and another one that adds the dense mode to KASAN (will post later). - Removed exporting kasan_poison() and used a wrapper instead in kasan_init_64.c - Prepended series with 4 patches from the risc-v series and applied review comments to the first patch as the rest already are reviewed. ======= Introduction The patchset aims to add a KASAN tag-based mode for the x86 architecture with the help of the new CPU feature called Linear Address Masking (LAM). Main improvement introduced by the series is 2x lower memory usage compared to KASAN's generic mode, the only currently available mode on x86. There are two relevant series in the process of adding KASAN tag-based support to x86. This one focuses on implementing and enabling the tag-based mode for the x86 architecture by using LAM. The second one attempts to add a new memory saving mechanism called "dense mode" to the non-arch part of the tag-based KASAN code. It can provide another 2x memory savings by packing tags denser in the shadow memory. ======= How KASAN tag-based mode works? When enabled, memory accesses and allocations are augmented by the compiler during kernel compilation. Instrumentation functions are added to each memory allocation and each pointer dereference. The allocation related functions generate a random tag and save it in two places: in shadow memory that maps to the allocated memory, and in the top bits of the pointer that points to the allocated memory. Storing the tag in the top of the pointer is possible because of Top-Byte Ignore (TBI) on arm64 architecture and LAM on x86. The access related functions are performing a comparison between the tag stored in the pointer and the one stored in shadow memory. If the tags don't match an out of bounds error must have occurred and so an error report is generated. The general idea for the tag-based mode is very well explained in the series with the original implementation [1]. [1] https://lore.kernel.org/all/cover.1544099024.git.andreyknvl@xxxxxxxxxx/ ======= Differences summary compared to the arm64 tag-based mode - Tag width: - Tag width influences the chance of a tag mismatch due to two tags from different allocations having the same value. The bigger the possible range of tag values the lower the chance of that happening. - Shortening the tag width from 8 bits to 4, while it can help with memory usage, it also increases the chance of not reporting an error. 4 bit tags have a ~7% chance of a tag mismatch. - TBI and LAM - TBI in arm64 allows for storing metadata in the top 8 bits of the virtual address. - LAM in x86 allows storing tags in bits [62:57] of the pointer. To maximize memory savings the tag width is reduced to bits [60:57]. - KASAN offset calculations - When converting addresses from memory to shadow memory the address is treated as a signed number. - On arm64 due to TBI half of tagged addresses will be positive and half negative. KASAN shadow offset means the middle point of the shadow memory there. - On x86 - due to the top bit of the pointer always being set in kernel address space - all the addresses will be negative when treated as signed offsets into shadow memory. KASAN shadow offset means the end of the shadow memory there. ======= Testing Checked all the kunits for both software tags and generic KASAN after making changes. In generic mode the results were: kasan: pass:59 fail:0 skip:13 total:72 Totals: pass:59 fail:0 skip:13 total:72 ok 1 kasan and for software tags: kasan: pass:63 fail:0 skip:9 total:72 Totals: pass:63 fail:0 skip:9 total:72 ok 1 kasan ======= Benchmarks All tests were ran on a Sierra Forest server platform with 512GB of memory. The only differences between the tests were kernel options: - CONFIG_KASAN - CONFIG_KASAN_GENERIC - CONFIG_KASAN_SW_TAGS - CONFIG_KASAN_INLINE [1] - CONFIG_KASAN_OUTLINE [1] More benchmarks are noted in the second series that adds the dense mode to KASAN. That's because most values on x86' tag-based mode are tailored to work well with that. Boot time (until login prompt): * 03:48 for clean kernel * 08:02 / 09:45 for generic KASAN (inline/outline) * 08:50 for tag-based KASAN * 04:50 for tag-based KASAN with stacktrace disabled [2] Compilation time comparison (10 cores): * 7:27 for clean kernel * 8:21/7:44 for generic KASAN (inline/outline) * 7:41 for tag-based KASAN [1] Based on hwasan and asan compiler parameters used in scripts/Makefile.kasan it looks like inline/outline modes have a bigger impact on generic mode than the tag-based mode. In the former inlining actually increases the kernel image size and improves performance. In the latter it un-inlines some code portions for debugging purposes when the outline mode is chosen but no real difference is visible in performance and kernel image size. [2] Memory allocation and freeing performance suffers heavily from saving stacktraces that can be later displayed in error reports. ======= Compilation Clang was used to compile the series (make LLVM=1) since gcc doesn't seem to have support for KASAN tag-based compiler instrumentation on x86. ======= Dependencies Series bases on four patches taken from [1] series. Mainly the idea of converting memory addresses to shadow memory addresses by treating them as signed as opposed to unsigned. I tried applying review comments to the first patch, while the other three are unchanged. The base branch for the series is the tip x86/mm branch. [1] https://lore.kernel.org/all/20241022015913.3524425-1-samuel.holland@xxxxxxxxxx/ ======= Enabling LAM for testing the series without LASS Since LASS is needed for LAM and it can't be compiled without it I enabled LAM during testing with the patch below: --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -2275,7 +2275,7 @@ config RANDOMIZE_MEMORY_PHYSICAL_PADDING config ADDRESS_MASKING bool "Linear Address Masking support" depends on X86_64 - depends on COMPILE_TEST || !CPU_MITIGATIONS # wait for LASS + #depends on COMPILE_TEST || !CPU_MITIGATIONS # wait for LASS help Linear Address Masking (LAM) modifies the checking that is applied to 64-bit linear addresses, allowing software to use of the --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -2401,9 +2401,10 @@ void __init arch_cpu_finalize_init(void) /* * Enable this when LAM is gated on LASS support + */ if (cpu_feature_enabled(X86_FEATURE_LAM)) USER_PTR_MAX = (1ul << 63) - PAGE_SIZE; - */ + runtime_const_init(ptr, USER_PTR_MAX); /* Maciej Wieczor-Retman (10): kasan: arm64: x86: Make special tags arch specific x86: Add arch specific kasan functions x86: Reset tag for virtual to physical address conversions x86: Physical address comparisons in fill_p*d/pte mm: Pcpu chunk address tag reset x86: KASAN raw shadow memory PTE init x86: LAM initialization x86: Minimal SLAB alignment x86: runtime_const used for KASAN_SHADOW_END x86: Make software tag-based kasan available Samuel Holland (4): kasan: sw_tags: Use arithmetic shift for shadow computation kasan: sw_tags: Check kasan_flag_enabled at runtime kasan: sw_tags: Support outline stack tag generation kasan: sw_tags: Support tag widths less than 8 bits Documentation/arch/x86/x86_64/mm.rst | 6 ++-- MAINTAINERS | 2 +- arch/arm64/Kconfig | 10 +++--- arch/arm64/include/asm/kasan-tags.h | 9 +++++ arch/arm64/include/asm/kasan.h | 6 ++-- arch/arm64/include/asm/memory.h | 14 +++++++- arch/arm64/include/asm/uaccess.h | 1 + arch/arm64/mm/kasan_init.c | 7 ++-- arch/x86/Kconfig | 9 +++-- arch/x86/boot/compressed/misc.h | 1 + arch/x86/include/asm/kasan-tags.h | 9 +++++ arch/x86/include/asm/kasan.h | 50 +++++++++++++++++++++++++--- arch/x86/include/asm/page.h | 17 +++++++--- arch/x86/include/asm/page_64.h | 2 +- arch/x86/kernel/head_64.S | 3 ++ arch/x86/kernel/setup.c | 2 ++ arch/x86/kernel/vmlinux.lds.S | 1 + arch/x86/mm/init.c | 3 ++ arch/x86/mm/init_64.c | 11 +++--- arch/x86/mm/kasan_init_64.c | 24 ++++++++++--- arch/x86/mm/physaddr.c | 1 + include/linux/kasan-enabled.h | 15 +++------ include/linux/kasan-tags.h | 19 ++++++++--- include/linux/kasan.h | 14 ++++++-- include/linux/mm.h | 6 ++-- include/linux/page-flags-layout.h | 7 +--- mm/kasan/hw_tags.c | 10 ------ mm/kasan/kasan.h | 2 ++ mm/kasan/report.c | 26 ++++++++++++--- mm/kasan/sw_tags.c | 9 +++++ mm/kasan/tags.c | 10 ++++++ mm/percpu-vm.c | 2 +- scripts/gdb/linux/mm.py | 5 +-- 33 files changed, 237 insertions(+), 76 deletions(-) create mode 100644 arch/arm64/include/asm/kasan-tags.h create mode 100644 arch/x86/include/asm/kasan-tags.h -- 2.47.1