On Fri, 1 Jul 2022 at 16:23, Alexander Potapenko <glider@xxxxxxxxxx> wrote: > > Add Documentation/dev-tools/kmsan.rst and reference it in the dev-tools > index. > > Signed-off-by: Alexander Potapenko <glider@xxxxxxxxxx> > --- > v2: > -- added a note that KMSAN is not intended for production use > > v4: > -- describe CONFIG_KMSAN_CHECK_PARAM_RETVAL > -- drop mentions of cpu_entry_area > -- add SPDX license > > Link: https://linux-review.googlesource.com/id/I751586f79418b95550a83c6035c650b5b01567cc > --- > Documentation/dev-tools/index.rst | 1 + > Documentation/dev-tools/kmsan.rst | 422 ++++++++++++++++++++++++++++++ > 2 files changed, 423 insertions(+) > create mode 100644 Documentation/dev-tools/kmsan.rst > > diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst > index 4621eac290f46..6b0663075dc04 100644 > --- a/Documentation/dev-tools/index.rst > +++ b/Documentation/dev-tools/index.rst > @@ -24,6 +24,7 @@ Documentation/dev-tools/testing-overview.rst > kcov > gcov > kasan > + kmsan > ubsan > kmemleak > kcsan > diff --git a/Documentation/dev-tools/kmsan.rst b/Documentation/dev-tools/kmsan.rst > new file mode 100644 > index 0000000000000..3fa5d7fb222c9 > --- /dev/null > +++ b/Documentation/dev-tools/kmsan.rst > @@ -0,0 +1,422 @@ > +.. SPDX-License-Identifier: GPL-2.0 > +.. Copyright (C) 2022, Google LLC. > + > +============================= > +KernelMemorySanitizer (KMSAN) > +============================= To be consistent with other tools, I think we have settled on "The Kernel <...> Sanitizer (K?SAN)", see Documentation/dev-tools/k[ac]san.rst. So this will be "The Kernel Memory Sanitizer (KMSAN)". > +KMSAN is a dynamic error detector aimed at finding uses of uninitialized > +values. It is based on compiler instrumentation, and is quite similar to the > +userspace `MemorySanitizer tool`_. > + > +An important note is that KMSAN is not intended for production use, because it > +drastically increases kernel memory footprint and slows the whole system down. > + > +Example report > +============== > + > +Here is an example of a KMSAN report:: > + > + ===================================================== > + BUG: KMSAN: uninit-value in test_uninit_kmsan_check_memory+0x1be/0x380 [kmsan_test] > + test_uninit_kmsan_check_memory+0x1be/0x380 mm/kmsan/kmsan_test.c:273 > + kunit_run_case_internal lib/kunit/test.c:333 > + kunit_try_run_case+0x206/0x420 lib/kunit/test.c:374 > + kunit_generic_run_threadfn_adapter+0x6d/0xc0 lib/kunit/try-catch.c:28 > + kthread+0x721/0x850 kernel/kthread.c:327 > + ret_from_fork+0x1f/0x30 ??:? > + > + Uninit was stored to memory at: > + do_uninit_local_array+0xfa/0x110 mm/kmsan/kmsan_test.c:260 > + test_uninit_kmsan_check_memory+0x1a2/0x380 mm/kmsan/kmsan_test.c:271 > + kunit_run_case_internal lib/kunit/test.c:333 > + kunit_try_run_case+0x206/0x420 lib/kunit/test.c:374 > + kunit_generic_run_threadfn_adapter+0x6d/0xc0 lib/kunit/try-catch.c:28 > + kthread+0x721/0x850 kernel/kthread.c:327 > + ret_from_fork+0x1f/0x30 ??:? > + > + Local variable uninit created at: > + do_uninit_local_array+0x4a/0x110 mm/kmsan/kmsan_test.c:256 > + test_uninit_kmsan_check_memory+0x1a2/0x380 mm/kmsan/kmsan_test.c:271 > + > + Bytes 4-7 of 8 are uninitialized > + Memory access of size 8 starts at ffff888083fe3da0 > + > + CPU: 0 PID: 6731 Comm: kunit_try_catch Tainted: G B E 5.16.0-rc3+ #104 > + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 > + ===================================================== > + > + > +The report says that the local variable ``uninit`` was created uninitialized in > +``do_uninit_local_array()``. The lower stack trace corresponds to the place -> "The third stack trace ..." (Because it looks like there's also another stack trace in the middle and "lower" is ambiguous) > +where this variable was created. > + > +The upper stack shows where the uninit value was used - in -> "The first stack trace shows where the uninit value was used (in ``test_uninit_kmsan_check_memory()``)." > +``test_uninit_kmsan_check_memory()``. The tool shows the bytes which were left > +uninitialized in the local variable, as well as the stack where the value was > +copied to another memory location before use. > + > +A use of uninitialized value ``v`` is reported by KMSAN in the following cases: > + - in a condition, e.g. ``if (v) { ... }``; > + - in an indexing or pointer dereferencing, e.g. ``array[v]`` or ``*v``; > + - when it is copied to userspace or hardware, e.g. ``copy_to_user(..., &v, ...)``; > + - when it is passed as an argument to a function, and > + ``CONFIG_KMSAN_CHECK_PARAM_RETVAL`` is enabled (see below). > + > +The mentioned cases (apart from copying data to userspace or hardware, which is > +a security issue) are considered undefined behavior from the C11 Standard point > +of view. > + > +KMSAN and Clang > +=============== The KASAN documentation has a section on "Support" which lists architectures and compilers supported. I'd try to mirror (or improve on) that. > +In order for KMSAN to work the kernel must be built with Clang, which so far is > +the only compiler that has KMSAN support. The kernel instrumentation pass is > +based on the userspace `MemorySanitizer tool`_. > + > +How to build > +============ I'd call it "Usage", like in the KASAN and KCSAN documentation. > +In order to build a kernel with KMSAN you will need a fresh Clang (14.0.0+). > +Please refer to `LLVM documentation`_ for the instructions on how to build Clang. > + > +Now configure and build the kernel with CONFIG_KMSAN enabled. I would move build/usage instructions right after introduction as that's most likely what users of KMSAN will want to know about first. > +How KMSAN works > +=============== > + > +KMSAN shadow memory > +------------------- > + > +KMSAN associates a metadata byte (also called shadow byte) with every byte of > +kernel memory. A bit in the shadow byte is set iff the corresponding bit of the > +kernel memory byte is uninitialized. Marking the memory uninitialized (i.e. > +setting its shadow bytes to ``0xff``) is called poisoning, marking it > +initialized (setting the shadow bytes to ``0x00``) is called unpoisoning. > + > +When a new variable is allocated on the stack, it is poisoned by default by > +instrumentation code inserted by the compiler (unless it is a stack variable > +that is immediately initialized). Any new heap allocation done without > +``__GFP_ZERO`` is also poisoned. > + > +Compiler instrumentation also tracks the shadow values with the help from the > +runtime library in ``mm/kmsan/``. This sentence might still be confusing. I think it should highlight that runtime and compiler go together, but depending on the scope of the value, the compiler invokes the runtime to persist the shadow. > +The shadow value of a basic or compound type is an array of bytes of the same > +length. When a constant value is written into memory, that memory is unpoisoned. > +When a value is read from memory, its shadow memory is also obtained and > +propagated into all the operations which use that value. For every instruction > +that takes one or more values the compiler generates code that calculates the > +shadow of the result depending on those values and their shadows. > + > +Example:: > + > + int a = 0xff; // i.e. 0x000000ff > + int b; > + int c = a | b; > + > +In this case the shadow of ``a`` is ``0``, shadow of ``b`` is ``0xffffffff``, > +shadow of ``c`` is ``0xffffff00``. This means that the upper three bytes of > +``c`` are uninitialized, while the lower byte is initialized. > + > + There are 2 blank lines here, which is inconsistent with the rest of the document. > +Origin tracking > +--------------- > + > +Every four bytes of kernel memory also have a so-called origin assigned to Is "assigned" or "mapped" more appropriate here? > +them. This origin describes the point in program execution at which the > +uninitialized value was created. Every origin is associated with either the > +full allocation stack (for heap-allocated memory), or the function containing > +the uninitialized variable (for locals). > + > +When an uninitialized variable is allocated on stack or heap, a new origin > +value is created, and that variable's origin is filled with that value. > +When a value is read from memory, its origin is also read and kept together > +with the shadow. For every instruction that takes one or more values the origin s/values the origin/values, the origin/ > +of the result is one of the origins corresponding to any of the uninitialized > +inputs. If a poisoned value is written into memory, its origin is written to the > +corresponding storage as well. > + > +Example 1:: > + > + int a = 42; > + int b; > + int c = a + b; > + > +In this case the origin of ``b`` is generated upon function entry, and is > +stored to the origin of ``c`` right before the addition result is written into > +memory. > + > +Several variables may share the same origin address, if they are stored in the > +same four-byte chunk. In this case every write to either variable updates the > +origin for all of them. We have to sacrifice precision in this case, because > +storing origins for individual bits (and even bytes) would be too costly. > + > +Example 2:: > + > + int combine(short a, short b) { > + union ret_t { > + int i; > + short s[2]; > + } ret; > + ret.s[0] = a; > + ret.s[1] = b; > + return ret.i; > + } > + > +If ``a`` is initialized and ``b`` is not, the shadow of the result would be > +0xffff0000, and the origin of the result would be the origin of ``b``. > +``ret.s[0]`` would have the same origin, but it will be never used, because s/be never/never be/ > +that variable is initialized. > + > +If both function arguments are uninitialized, only the origin of the second > +argument is preserved. > + > +Origin chaining > +~~~~~~~~~~~~~~~ > + > +To ease debugging, KMSAN creates a new origin for every store of an > +uninitialized value to memory. The new origin references both its creation stack > +and the previous origin the value had. This may cause increased memory > +consumption, so we limit the length of origin chains in the runtime. > + > +Clang instrumentation API > +------------------------- > + > +Clang instrumentation pass inserts calls to functions defined in > +``mm/kmsan/instrumentation.c`` into the kernel code. > + > +Shadow manipulation > +~~~~~~~~~~~~~~~~~~~ > + > +For every memory access the compiler emits a call to a function that returns a > +pair of pointers to the shadow and origin addresses of the given memory:: > + > + typedef struct { > + void *shadow, *origin; > + } shadow_origin_ptr_t > + > + shadow_origin_ptr_t __msan_metadata_ptr_for_load_{1,2,4,8}(void *addr) > + shadow_origin_ptr_t __msan_metadata_ptr_for_store_{1,2,4,8}(void *addr) > + shadow_origin_ptr_t __msan_metadata_ptr_for_load_n(void *addr, uintptr_t size) > + shadow_origin_ptr_t __msan_metadata_ptr_for_store_n(void *addr, uintptr_t size) > + > +The function name depends on the memory access size. > + > +The compiler makes sure that for every loaded value its shadow and origin > +values are read from memory. When a value is stored to memory, its shadow and > +origin are also stored using the metadata pointers. > + > +Handling locals > +~~~~~~~~~~~~~~~ > + > +A special function is used to create a new origin value for a local variable and > +set the origin of that variable to that value:: > + > + void __msan_poison_alloca(void *addr, uintptr_t size, char *descr) > + > +Access to per-task data > +~~~~~~~~~~~~~~~~~~~~~~~~~ > + > +At the beginning of every instrumented function KMSAN inserts a call to > +``__msan_get_context_state()``:: > + > + kmsan_context_state *__msan_get_context_state(void) > + > +``kmsan_context_state`` is declared in ``include/linux/kmsan.h``:: > + > + struct kmsan_context_state { > + char param_tls[KMSAN_PARAM_SIZE]; > + char retval_tls[KMSAN_RETVAL_SIZE]; > + char va_arg_tls[KMSAN_PARAM_SIZE]; > + char va_arg_origin_tls[KMSAN_PARAM_SIZE]; > + u64 va_arg_overflow_size_tls; > + char param_origin_tls[KMSAN_PARAM_SIZE]; > + depot_stack_handle_t retval_origin_tls; > + }; > + > +This structure is used by KMSAN to pass parameter shadows and origins between > +instrumented functions (unless the parameters are checked immediately by > +``CONFIG_KMSAN_CHECK_PARAM_RETVAL``). > + > +Passing uninitialized values to functions > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > + > +KMSAN instrumentation pass has an option, ``-fsanitize-memory-param-retval``, "KMSAN instrumentation pass" -> "Clang's instrumentation support" ? Because it seems wrong to say that KMSAN has the instrumentation pass. > +which makes the compiler check function parameters passed by value, as well as > +function return values. > + > +The option is controlled by ``CONFIG_KMSAN_CHECK_PARAM_RETVAL``, which is > +enabled by default to let KMSAN report uninitialized values earlier. > +Please refer to the `LKML discussion`_ for more details. > + > +Because of the way the checks are implemented in LLVM (they are only applied to > +parameters marked as ``noundef``), not all parameters are guaranteed to be > +checked, so we cannot give up the metadata storage in ``kmsan_context_state``. > + > +String functions > +~~~~~~~~~~~~~~~~ > + > +The compiler replaces calls to ``memcpy()``/``memmove()``/``memset()`` with the > +following functions. These functions are also called when data structures are > +initialized or copied, making sure shadow and origin values are copied alongside > +with the data:: > + > + void *__msan_memcpy(void *dst, void *src, uintptr_t n) > + void *__msan_memmove(void *dst, void *src, uintptr_t n) > + void *__msan_memset(void *dst, int c, uintptr_t n) > + > +Error reporting > +~~~~~~~~~~~~~~~ > + > +For each use of a value the compiler emits a shadow check that calls > +``__msan_warning()`` in the case that value is poisoned:: > + > + void __msan_warning(u32 origin) > + > +``__msan_warning()`` causes KMSAN runtime to print an error report. > + > +Inline assembly instrumentation > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > + > +KMSAN instruments every inline assembly output with a call to:: > + > + void __msan_instrument_asm_store(void *addr, uintptr_t size) > + > +, which unpoisons the memory region. > + > +This approach may mask certain errors, but it also helps to avoid a lot of > +false positives in bitwise operations, atomics etc. > + > +Sometimes the pointers passed into inline assembly do not point to valid memory. > +In such cases they are ignored at runtime. > + > +Disabling the instrumentation > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ It would be useful to move this section somewhere to the beginning, closer to usage and the example, as this is information that a user of KMSAN might want to know (but they might not want to know much about how KMSAN works). > +A function can be marked with ``__no_kmsan_checks``. Doing so makes KMSAN > +ignore uninitialized values in that function and mark its output as initialized. > +As a result, the user will not get KMSAN reports related to that function. > + > +Another function attribute supported by KMSAN is ``__no_sanitize_memory``. > +Applying this attribute to a function will result in KMSAN not instrumenting it, > +which can be helpful if we do not want the compiler to mess up some low-level s/mess up/interfere with/ > +code (e.g. that marked with ``noinstr``). maybe "... (e.g. that marked with ``noinstr``, which implicitly adds ``__no_sanitize_memory``)." otherwise people might think that it's necessary to add __no_sanitize_memory explicitly to noinstr. > + > +This however comes at a cost: stack allocations from such functions will have > +incorrect shadow/origin values, likely leading to false positives. Functions > +called from non-instrumented code may also receive incorrect metadata for their > +parameters. > + > +As a rule of thumb, avoid using ``__no_sanitize_memory`` explicitly. > + > +It is also possible to disable KMSAN for a single file (e.g. main.o):: > + > + KMSAN_SANITIZE_main.o := n > + > +or for the whole directory:: > + > + KMSAN_SANITIZE := n > + > +in the Makefile. Think of this as applying ``__no_sanitize_memory`` to every > +function in the file or directory. Most users won't need KMSAN_SANITIZE, unless > +their code gets broken by KMSAN (e.g. runs at early boot time). > + > +Runtime library > +--------------- > + > +The code is located in ``mm/kmsan/``. > + > +Per-task KMSAN state > +~~~~~~~~~~~~~~~~~~~~ > + > +Every task_struct has an associated KMSAN task state that holds the KMSAN > +context (see above) and a per-task flag disallowing KMSAN reports:: > + > + struct kmsan_context { > + ... > + bool allow_reporting; > + struct kmsan_context_state cstate; > + ... > + } > + > + struct task_struct { > + ... > + struct kmsan_context kmsan; > + ... > + } > + > + 1 blank line instead of 2? > +KMSAN contexts > +~~~~~~~~~~~~~~ > + > +When running in a kernel task context, KMSAN uses ``current->kmsan.cstate`` to > +hold the metadata for function parameters and return values. > + > +But in the case the kernel is running in the interrupt, softirq or NMI context, > +where ``current`` is unavailable, KMSAN switches to per-cpu interrupt state:: > + > + DEFINE_PER_CPU(struct kmsan_ctx, kmsan_percpu_ctx); > + > +Metadata allocation > +~~~~~~~~~~~~~~~~~~~ > + > +There are several places in the kernel for which the metadata is stored. > + > +1. Each ``struct page`` instance contains two pointers to its shadow and > +origin pages:: > + > + struct page { > + ... > + struct page *shadow, *origin; > + ... > + }; > + > +At boot-time, the kernel allocates shadow and origin pages for every available > +kernel page. This is done quite late, when the kernel address space is already > +fragmented, so normal data pages may arbitrarily interleave with the metadata > +pages. > + > +This means that in general for two contiguous memory pages their shadow/origin > +pages may not be contiguous. So, if a memory access crosses the boundary s/So, /Consequently, / > +of a memory block, accesses to shadow/origin memory may potentially corrupt > +other pages or read incorrect values from them. > + > +In practice, contiguous memory pages returned by the same ``alloc_pages()`` > +call will have contiguous metadata, whereas if these pages belong to two > +different allocations their metadata pages can be fragmented. > + > +For the kernel data (``.data``, ``.bss`` etc.) and percpu memory regions > +there also are no guarantees on metadata contiguity. > + > +In the case ``__msan_metadata_ptr_for_XXX_YYY()`` hits the border between two > +pages with non-contiguous metadata, it returns pointers to fake shadow/origin regions:: > + > + char dummy_load_page[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE))); > + char dummy_store_page[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE))); > + > +``dummy_load_page`` is zero-initialized, so reads from it always yield zeroes. > +All stores to ``dummy_store_page`` are ignored. > + > +2. For vmalloc memory and modules, there is a direct mapping between the memory > +range, its shadow and origin. KMSAN reduces the vmalloc area by 3/4, making only > +the first quarter available to ``vmalloc()``. The second quarter of the vmalloc > +area contains shadow memory for the first quarter, the third one holds the > +origins. A small part of the fourth quarter contains shadow and origins for the > +kernel modules. Please refer to ``arch/x86/include/asm/pgtable_64_types.h`` for > +more details. > + > +When an array of pages is mapped into a contiguous virtual memory space, their > +shadow and origin pages are similarly mapped into contiguous regions. > + > +References > +========== > + > +E. Stepanov, K. Serebryany. `MemorySanitizer: fast detector of uninitialized > +memory use in C++ > +<https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43308.pdf>`_. > +In Proceedings of CGO 2015. > + > +.. _MemorySanitizer tool: https://clang.llvm.org/docs/MemorySanitizer.html > +.. _LLVM documentation: https://llvm.org/docs/GettingStarted.html > +.. _LKML discussion: https://lore.kernel.org/all/20220614144853.3693273-1-glider@xxxxxxxxxx/ > -- > 2.37.0.rc0.161.g10f37bed90-goog >