> Could all these just be using '.macro .endm'?

Done in v4

> > +__no_sanitize_memory
> > +static inline unsigned long KMSAN_INIT_8(unsigned long value)
> > +{
> > +	return value;
> > +}
>
> Should the above be __always_inline?

No. __always_inline forces a non-instrumented function to be inlined
into its instrumented caller, which results in the former being
instrumented. I've updated the comment to reflect that.

> Does it make sense to use u8, u16, u32, u64 here -- just in case it's
> ported to other architectures in future?

Done in v4.

> > +	default: \
> > +		BUILD_BUG_ON(1); \
> > +	} \
> > +	__ret; \
> > +	}) /**/
>
> Is the /**/ needed?

No, as long as we use .macro and .endm.

>
> It would be good to add doc comments to all API functions.

Done in v4

> > +extern bool kmsan_ready;
>
> What does this variable mean. Would 'kmsan_enabled' be more accurate?

I think kmsan_inited is a better name, if we want to change it at all.
kmsan_enabled somewhat implies KMSAN can be disabled.

> This is in include/linux -- do they need a KMSAN_ prefix to not clash
> with other definitions?

Done in v4.

> > +#define KMSAN_PARAM_SIZE 800
> > +
> > +#define PARAM_ARRAY_SIZE (KMSAN_PARAM_SIZE / sizeof(depot_stack_handle_t))
>
> Similar here -- does it need a KMSAN_ prefix?

Done in v4.

> > +void kmsan_clear_page(void *page_addr);
>
> It would be good to have doc comments for each of them.

Done in v4.

> > +
> > +KMSAN_SANITIZE := n
> > +KCOV_INSTRUMENT := n
>
> Does KMSAN work together with UBSAN? In that case may this needs a
> UBSAN_SANITIZE := n

Done

> > +#include <linux/stackdepot.h>
> > +#include <linux/stacktrace.h>
> > +#include <linux/types.h>
> > +#include <linux/vmalloc.h>
> > +
> > +#include <linux/mmzone.h>
>
> Why the space above the mmzone.h include?

Removed it, also fixed the include order for this file.

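For reference, the PARAM_ARRAY_SIZE arithmetic from the hunk quoted above
can be checked in userspace. This is only a sketch: it assumes
depot_stack_handle_t is a 32-bit integer, the typedef below is a stand-in
for the kernel header, and param_array_size() is a made-up helper. The
macro names are the pre-rename ones from the quoted patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: stand-in for the kernel's 32-bit stack depot handle. */
typedef uint32_t depot_stack_handle_t;

/* Copied from the quoted hunk (names as they were before the rename). */
#define KMSAN_PARAM_SIZE 800
#define PARAM_ARRAY_SIZE (KMSAN_PARAM_SIZE / sizeof(depot_stack_handle_t))

/* The division only makes sense if the byte size is a multiple of the handle size. */
_Static_assert(KMSAN_PARAM_SIZE % sizeof(depot_stack_handle_t) == 0,
	       "KMSAN_PARAM_SIZE must be a multiple of the handle size");

/* Hypothetical helper: number of handles that fit in KMSAN_PARAM_SIZE bytes. */
static inline size_t param_array_size(void)
{
	return PARAM_ARRAY_SIZE;
}
```

With a 4-byte handle this gives 800 / 4 = 200 array entries.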
> > +/*
> > + * Some kernel asm() calls mention the non-existing |__force_order| variable
> > + * in the asm constraints to preserve the order of accesses to control
> > + * registers. KMSAN turns those mentions into actual memory accesses, therefore
> > + * the variable is now required to link the kernel.
> > + */
> > +unsigned long __force_order;
>
> Not sure if this is related, but when compiling with KMSAN I get
>
> ERROR: "__force_order" [drivers/misc/lkdtm/lkdtm.ko] undefined!
>
> with a default config with KMSAN selected.

Added an EXPORT_SYMBOL to fix this.

>
> > +bool kmsan_ready;
> > +#define KMSAN_STACK_DEPTH 64
> > +#define MAX_CHAIN_DEPTH 7
>
> Should these defines be above the variable definitions?

Done

>
> Why not just 'panic("%s: ...", __func__, ...)' ?
>
> If the BUG() should not be here, then maybe just WARN_ON?

Replaced with panic().

> > +
> > +/*
> > + * TODO(glider): writing an initialized byte shouldn't zero out the origin, if
> > + * the remaining three bytes are uninitialized.
> > + */
>
> What needs to be done to address the TODO? Just adding a comment is
> fine (or if the TODO can be resolved that's also fine).

Filed https://github.com/google/kmsan/issues/70 to track this.
This isn't a showstopper.

> > +	if (checked && !metadata_is_contiguous(addr, size, META_ORIGIN)) {
> > +		kmsan_pr_locked("WARNING: not setting origin for %d bytes starting at %px, because the metadata is incontiguous\n", size, addr);
> > +		BUG();
>
> Just panic?

Done.

> > +/*
> > + * TODO(glider): this check shouldn't be performed for origin pages, because
> > + * they're always accessed after the shadow pages.
> > + */
>
> What needs to be done to address the TODO? Just adding a comment is
> fine (or if the TODO can be resolved that's also fine).

Dropped the TODO. This is somewhat perfectionist.

> > +	if (origin_p) {
> > +		kmsan_pr_locked("Origin: %08x\n", *origin_p);
> > +		kmsan_print_origin(*origin_p);
> > +	} else {
> > +		kmsan_pr_locked("Origin: unavailable\n");
> > +	}
>
> These repeated calls to kmsan_pr_locked seem unnecessary. There is
> nothing ensuring atomicity of all these print calls w.r.t. reporting.

Replaced them with pr_err().

> > +/* Stolen from kernel/printk/internal.h */
> > +#define PRINTK_SAFE_CONTEXT_MASK 0x3fffffff
>
> Is this used anywhere?

No. Removed it.

> > +/* Called by kmsan_report.c under a lock. */
> > +#define kmsan_pr_err(...) pr_err(__VA_ARGS__)
>
> Why is this macro needed? It's never redefined, so in the places it is
> used, you can just use pr_err. For readability I would avoid unnecessary
> aliases, but if there is a genuine reason this may be needed in future,
> I would just add a comment.

I've removed the macro.

> > +/* Used in other places - doesn't require a lock. */
> > +#define kmsan_pr_locked(...) \
> > +	do { \
> > +		unsigned long flags; \
> > +		spin_lock_irqsave(&report_lock, flags); \
> > +		pr_err(__VA_ARGS__); \
> > +		spin_unlock_irqrestore(&report_lock, flags); \
> > +	} while (0)
>
> Is this macro needed? The only reason it sort of makes sense is to
> serialize a report with other printing, but otherwise pr_err already
> makes sure things are serialized properly.

Yes, this was the intention. On the other hand, this lock doesn't
prevent non-KMSAN code from messing up KMSAN reports, so it makes little
sense. Maybe we can just keep the spinlock to separate the reports from
each other.

> > +enum KMSAN_BUG_REASON {
> > +	REASON_ANY = 0,
> > +	REASON_COPY_TO_USER = 1,
> > +	REASON_USE_AFTER_FREE = 2,
> > +	REASON_SUBMIT_URB = 3,
> > +};
>
> Is it required to explicitly assign constants to these?

No. Removed the constants.

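Why dropping the constants is safe: in C, an enumerator without an
initializer is 0 if it comes first, and the previous enumerator's value
plus one otherwise. A minimal userspace sketch with the enumerators from
the quoted hunk (the lowercase enum tag and the reason_value() helper are
made up for illustration and are not taken from the patch):

```c
#include <assert.h>

/* Same enumerators as in the quoted hunk, with the explicit values removed. */
enum kmsan_bug_reason {
	REASON_ANY,
	REASON_COPY_TO_USER,
	REASON_USE_AFTER_FREE,
	REASON_SUBMIT_URB,
};

/* The implicit values match the ones that used to be spelled out. */
_Static_assert(REASON_ANY == 0, "first enumerator is 0");
_Static_assert(REASON_COPY_TO_USER == 1, "previous + 1");
_Static_assert(REASON_USE_AFTER_FREE == 2, "previous + 1");
_Static_assert(REASON_SUBMIT_URB == 3, "previous + 1");

/* Hypothetical helper: expose the numeric value of a reason. */
static inline int reason_value(enum kmsan_bug_reason reason)
{
	return (int)reason;
}
```

Explicit values would only matter if the numbers were part of an ABI or
had to stay stable when enumerators get reordered.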
> > +#define LEAVE_RUNTIME(irq_flags) \
> > +	do { \
> > +		this_cpu_dec(kmsan_in_runtime); \
> > +		if (this_cpu_read(kmsan_in_runtime)) { \
> > +			kmsan_pr_err("kmsan_in_runtime: %d\n", \
> > +				this_cpu_read(kmsan_in_runtime)); \
> > +			BUG(); \
> > +		} \
> > +		restart_nmi(); \
> > +		local_irq_restore(irq_flags); \
> > +		preempt_enable(); } while (0)
>
> Could these not be macros, and instead be static __always_inline
> functions?

Done

> > +static void kmsan_context_exit(void)
> > +{
> > +	int level = this_cpu_read(kmsan_context_level) - 1;
> > +
> > +	BUG_ON(level < 0);
> > +	this_cpu_write(kmsan_context_level, level);
> > +}
>
> These are not preemption-safe. this_cpu_dec_return followed by the
> BUG_ON should be sufficient. Similarly above and below (using
> this_cpu_add_return)

Good catch, thank you!

> > +void kmsan_interrupt_exit(void)
> > +{
> > +	int in_interrupt = this_cpu_read(kmsan_in_interrupt);
> > +
> > +	BUG_ON(!in_interrupt);
> > +	kmsan_context_exit();
> > +	/* Can't check preempt_count() here, it may be zero. */
> > +	this_cpu_write(kmsan_in_interrupt, in_interrupt - 1);
> > +}
> > +EXPORT_SYMBOL(kmsan_interrupt_exit);
>
> Why exactly does kmsan_in_interrupt need to be maintained here? I can't
> see them being used anywhere else. Is it only for the BUG_ON?

Yes, initially some consistency checks made sense. I think it's safe to
delete them now.

> > +void kmsan_softirq_exit(void)
> > +{
> > +	bool in_softirq = this_cpu_read(kmsan_in_softirq);
> > +
> > +	BUG_ON(!in_softirq);
> > +	kmsan_context_exit();
> > +	/* Can't check preempt_count() here, it may be zero. */
> > +	this_cpu_write(kmsan_in_softirq, false);
> > +}
> > +EXPORT_SYMBOL(kmsan_softirq_exit);
>
> Same question here for kmsan_in_softirq.

Ditto

> > +void kmsan_nmi_exit(void)
> > +{
> > +	bool in_nmi = this_cpu_read(kmsan_in_nmi);
> > +
> > +	BUG_ON(!in_nmi);
> > +	BUG_ON(preempt_count() & NMI_MASK);
> > +	kmsan_context_exit();
> > +	this_cpu_write(kmsan_in_nmi, false);
> > +
> > +}
> > +EXPORT_SYMBOL(kmsan_nmi_exit);
>
> And same question here for kmsan_in_nmi.

Ditto.

> > +
> > +/*
> > + * Record a range of memory for which the metadata pages will be created once
> > + * the page allocator becomes available.
> > + * TODO(glider): squash together ranges belonging to the same page.
> > + */
>
> What needs to be done to address the TODO? Just adding a comment is
> fine (or if the TODO can be resolved that's also fine).

Removed the TODO. There's a problem with non-contiguous pages which is
tracked at https://github.com/google/kmsan/issues/71

> > +	/*
> > +	 * TODO(glider): alloc_node_data() in arch/x86/mm/numa.c uses
> > +	 * sizeof(pg_data_t).
> > +	 */
>
> What needs to be done to address the TODO? Just adding a comment is
> fine (or if the TODO can be resolved that's also fine).

Resolved this (the code is actually correct)

> > +
> > +	if (IN_RUNTIME()) {
> > +		/*
> > +		 * TODO(glider): looks legit. depot_save_stack() may call
> > +		 * free_pages().
> > +		 */
>
> What needs to be done to address the TODO? Just adding a comment is
> fine (or if the TODO can be resolved that's also fine).

I've just dropped this if-clause.

> > +		return;
> > +	}
> > +
> > +	ENTER_RUNTIME(irq_flags);
> > +	shadow = shadow_page_for(&page[0]);
> > +	origin = origin_page_for(&page[0]);
> > +
> > +	/* TODO(glider): this is racy. */
>
> Can this be fixed or does the race not matter -- in the latter case,
> just remove the TODO and turn it into a NOTE or similar.

It doesn't matter. Removed the comment.

-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Managing Directors: Paul Manicle, Halimah DeLaine Prado
Registration court and number: Hamburg, HRB 86891
Registered office: Hamburg