This patchset is not complete (hence sending as RFC), but I would like to
start the discussion now and hear people's opinions regarding the questions
mentioned below.

=== Overview

This patchset adopts the existing hardware tag-based KASAN mode [1] for use
in production as a memory corruption mitigation. Hardware tag-based KASAN
relies on arm64 Memory Tagging Extension (MTE) [2] to perform memory and
pointer tagging. Please see [3] and [4] for detailed analysis of how MTE
helps to fight memory safety problems.

The current plan is to reuse CONFIG_KASAN_HW_TAGS for production, but add a
boot-time switch that allows choosing between a debugging mode, which
includes all KASAN features as they are, and a production mode, which only
includes the essentials like tag checking.

It is essential that switching between these modes doesn't require
rebuilding the kernel with different configs, as this is required by the
Android GKI initiative [5].

The patch titled "kasan: add and integrate kasan boot parameters" of this
series adds a few new boot parameters:

kasan.mode allows choosing one of three main modes:

- kasan.mode=off - no checks at all
- kasan.mode=prod - only essential production features
- kasan.mode=full - all features

Each mode provides default values for three more internal configs listed
below. However, it's also possible to override the default values by
providing:

- kasan.stack=off/on - enable alloc/free stack collection
                       (default: on for mode=full, otherwise off)
- kasan.trap=async/sync - use async or sync MTE mode
                          (default: sync for mode=full, otherwise async)
- kasan.fault=report/panic - only report MTE fault or also panic
                             (default: report)

=== Benchmarks

For now I've only performed a few simple benchmarks such as measuring
kernel boot time and slab memory usage after boot.
The benchmarks were performed in QEMU and the results below exclude the
slowdown caused by QEMU memory tagging emulation (as it's different from
the slowdown that will be introduced by hardware and therefore irrelevant).

KASAN_HW_TAGS=y + kasan.mode=off introduces no performance or memory impact
compared to KASAN_HW_TAGS=n.

kasan.mode=prod (without executing the tagging instructions) introduces a
7% impact on both performance and memory compared to kasan.mode=off.
Note that 4% of performance and all 7% of memory impact are caused by the
fact that enabling KASAN essentially results in CONFIG_SLAB_MERGE_DEFAULT
being disabled.

Recommended Android config has CONFIG_SLAB_MERGE_DEFAULT disabled (I assume
for security reasons), but Pixel 4 has it enabled. It's arguable whether
"disabling" CONFIG_SLAB_MERGE_DEFAULT introduces any security benefit on
top of MTE. Without MTE it makes exploiting some heap corruption harder.
With MTE it will only make it harder provided that the attacker is able to
predict allocation tags.

kasan.mode=full has 40% performance and 30% memory impact over
kasan.mode=prod. Both come from alloc/free stack collection.

=== Questions

Any concerns about the boot parameters?

Should we try to deal with CONFIG_SLAB_MERGE_DEFAULT-like behavior
mentioned above?

=== Notes

This patchset is available here:

https://github.com/xairy/linux/tree/up-prod-mte-rfc2

and on Gerrit here:

https://linux-review.googlesource.com/c/linux/kernel/git/torvalds/linux/+/3707

This patchset is based on v5 of "kasan: add hardware tag-based mode for
arm64" patchset [1] (along with some fixes).

For testing in QEMU hardware tag-based KASAN requires:

1. QEMU built from master [6] (use "-machine virt,mte=on -cpu max"
   arguments to run).
2. GCC version 10.
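
To make the boot parameter semantics from the Overview concrete, here is a
rough userspace sketch of how kasan.mode could supply defaults that
kasan.stack, kasan.trap and kasan.fault then override. This is not code
from the series: all names (arg_is, resolve_config, struct kasan_cfg and
the enums) are hypothetical, and treating a missing kasan.mode as "full" is
an assumption made only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types for illustration only; not taken from the patches. */
enum kasan_mode  { MODE_OFF, MODE_PROD, MODE_FULL };
enum kasan_trap  { TRAP_ASYNC, TRAP_SYNC };
enum kasan_fault { FAULT_REPORT, FAULT_PANIC };

struct kasan_cfg {
	bool enabled;           /* false only for kasan.mode=off */
	bool stack;             /* collect alloc/free stacks */
	enum kasan_trap trap;   /* async or sync MTE mode */
	enum kasan_fault fault; /* report only, or also panic */
};

/* Return true if the space-separated cmdline contains exactly "key=val". */
static bool arg_is(const char *cmdline, const char *key, const char *val)
{
	size_t klen = strlen(key), vlen = strlen(val);
	const char *p = cmdline;

	while ((p = strstr(p, key)) != NULL) {
		bool at_start = (p == cmdline) || (p[-1] == ' ');
		const char *v = p + klen;

		if (at_start && *v == '=' &&
		    strncmp(v + 1, val, vlen) == 0 &&
		    (v[1 + vlen] == ' ' || v[1 + vlen] == '\0'))
			return true;
		p += klen;
	}
	return false;
}

/* Derive defaults from kasan.mode, then apply explicit overrides. */
static struct kasan_cfg resolve_config(const char *cmdline)
{
	struct kasan_cfg cfg;
	/* Assumption: no kasan.mode on the command line means "full". */
	enum kasan_mode mode = MODE_FULL;

	assert(cmdline != NULL);

	if (arg_is(cmdline, "kasan.mode", "off"))
		mode = MODE_OFF;
	else if (arg_is(cmdline, "kasan.mode", "prod"))
		mode = MODE_PROD;

	/* Defaults as listed in the Overview. */
	cfg.enabled = (mode != MODE_OFF);
	cfg.stack = (mode == MODE_FULL);
	cfg.trap = (mode == MODE_FULL) ? TRAP_SYNC : TRAP_ASYNC;
	cfg.fault = FAULT_REPORT;

	/*
	 * Explicit overrides. How they interact with mode=off is not
	 * specified in the cover letter; here the fields are still set
	 * but 'enabled' stays false.
	 */
	if (arg_is(cmdline, "kasan.stack", "on"))
		cfg.stack = true;
	else if (arg_is(cmdline, "kasan.stack", "off"))
		cfg.stack = false;

	if (arg_is(cmdline, "kasan.trap", "sync"))
		cfg.trap = TRAP_SYNC;
	else if (arg_is(cmdline, "kasan.trap", "async"))
		cfg.trap = TRAP_ASYNC;

	if (arg_is(cmdline, "kasan.fault", "panic"))
		cfg.fault = FAULT_PANIC;
	else if (arg_is(cmdline, "kasan.fault", "report"))
		cfg.fault = FAULT_REPORT;

	return cfg;
}
```

For example, in this sketch resolve_config("kasan.mode=prod kasan.stack=on")
yields a production configuration (async trap, report-only fault) with
stack collection switched back on.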

[1] https://lore.kernel.org/linux-arm-kernel/cover.1602535397.git.andreyknvl@xxxxxxxxxx/
[2] https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/enhancing-memory-safety
[3] https://arxiv.org/pdf/1802.09517.pdf
[4] https://github.com/microsoft/MSRC-Security-Research/blob/master/papers/2020/Security%20analysis%20of%20memory%20tagging.pdf
[5] https://source.android.com/devices/architecture/kernel/generic-kernel-image
[6] https://github.com/qemu/qemu

=== History

Changes RFCv1->RFCv2:
- Rework boot parameters.
- Drop __init from empty kasan_init_tags() definition.
- Add cpu_supports_mte() helper that can be used during early boot and use
  it in kasan_init_tags()
- Lots of new KASAN optimization commits.

Andrey Konovalov (21):
  kasan: simplify quarantine_put call site
  kasan: rename get_alloc/free_info
  kasan: introduce set_alloc_info
  kasan: unpoison stack only with CONFIG_KASAN_STACK
  kasan: allow VMAP_STACK for HW_TAGS mode
  kasan: mark kasan_init_tags as __init
  kasan, arm64: move initialization message
  kasan: remove __kasan_unpoison_stack
  kasan: inline kasan_reset_tag for tag-based modes
  kasan: inline random_tag for HW_TAGS
  kasan: inline kasan_poison_memory and check_invalid_free
  kasan: inline and rename kasan_unpoison_memory
  arm64: kasan: Add cpu_supports_tags helper
  kasan: add and integrate kasan boot parameters
  kasan: check kasan_enabled in annotations
  kasan: optimize poisoning in kmalloc and krealloc
  kasan: simplify kasan_poison_kfree
  kasan: rename kasan_poison_kfree
  kasan: don't round_up too much
  kasan: simplify assign_tag and set_tag calls
  kasan: clarify comment in __kasan_kfree_large

 arch/Kconfig                       |   2 +-
 arch/arm64/include/asm/memory.h    |   1 +
 arch/arm64/include/asm/mte-kasan.h |   6 +
 arch/arm64/kernel/mte.c            |  20 +++
 arch/arm64/kernel/sleep.S          |   2 +-
 arch/arm64/mm/kasan_init.c         |   3 +
 arch/x86/kernel/acpi/wakeup_64.S   |   2 +-
 include/linux/kasan.h              | 225 ++++++++++++++++++-------
 include/linux/mm.h                 |  27 ++-
 kernel/fork.c                      |   2 +-
 mm/kasan/common.c                  | 256 ++++++++++++++++-------------
 mm/kasan/generic.c                 |  19 ++-
 mm/kasan/hw_tags.c                 | 182 +++++++++++++++++---
 mm/kasan/kasan.h                   | 102 ++++++++----
 mm/kasan/quarantine.c              |   5 +-
 mm/kasan/report.c                  |  26 ++-
 mm/kasan/report_sw_tags.c          |   2 +-
 mm/kasan/shadow.c                  |   1 +
 mm/kasan/sw_tags.c                 |  20 ++-
 mm/mempool.c                       |   2 +-
 mm/slab_common.c                   |   2 +-
 mm/slub.c                          |   3 +-
 22 files changed, 641 insertions(+), 269 deletions(-)

--
2.29.0.rc1.297.gfa9743e501-goog