Re: [PATCH RFC v3 10/36] kmsan: add KMSAN runtime

> 1. There's a lot of TODOs in the code. They either need to be resolved
> or removed.

Done in v4

> 2. This patch is huge, would it be possible to split it? One way to do
> this is to have two parts, one that adds the headers and empty hooks,
> and the other one that adds hooks implementations. Or something like
> that, if that's feasible at all.

I've split away kmsan_hooks.c, kmsan_entry.c and kmsan_instr.c
Let's see if that helps.

> > + * Adopted from KTSAN assembly hooks implementation by Dmitry Vyukov:
> > + * https://github.com/google/ktsan/blob/ktsan/arch/x86/include/asm/ktsan.h
>
> This link can get stale. Maybe just link the repo?

Guess there's not much code left to credit, only a series of push
instructions. Not sure it's worth it. I've removed this comment.

> > + * KMSAN checks.
>
> This comment just repeats the file name. Maybe worth mentioning what
> exactly we are checking here, and how this header is different from
> kmsan.h. Perhaps some of the functions declared here should be moved
> there as well.

I've expanded the comment a bit, added doc comments and moved the
functions not supposed to be widely used to kmsan.h

> > +struct task_struct;
> > +struct vm_struct;
> > +
> > +
>
> Remove unneeded whitespace.

Done

> > + if (checked) {
> > + kmsan_pr_locked("WARNING: not memsetting %d bytes starting at %px, because the shadow is NULL\n", to_fill, address);
>
> Why not use WARN()?

Changed this to panic().

> > +
> > + if (!kmsan_ready)
> > + return 0;
>
> Do we need this check here?
>

Right, this is an internal function; callers must ensure we're not in
the runtime.

> > +
> > + if (!id)
> > + return id;
>
> And this one?
>

This one we do need, as there are cases in which the caller may pass a
zero origin to us.

> > + /* Lowest bit is the UAF flag, higher bits hold the depth. */
> > + extra_bits = (depth << 1) | (extra_bits & 1);
>
> Please add some helper functions/macros to work with extra_bits.

Done
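For the curious, here's a sketch of what such helpers could look like, given
the encoding quoted above (lowest bit is the UAF flag, higher bits hold the
depth). The names are illustrative only, not necessarily what landed in v4:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helpers for the extra_bits encoding: bit 0 is the UAF
 * flag, the remaining bits hold the origin chain depth.
 */
static inline unsigned int kmsan_extra_bits(unsigned int depth, bool uaf)
{
	return (depth << 1) | (uaf ? 1u : 0u);
}

static inline bool kmsan_uaf_from_extra_bits(unsigned int extra)
{
	return extra & 1;
}

static inline unsigned int kmsan_depth_from_extra_bits(unsigned int extra)
{
	return extra >> 1;
}
```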

> > + if (checked && !metadata_is_contiguous(addr, size, META_ORIGIN)) {
> > + kmsan_pr_locked("WARNING: not setting origin for %d bytes starting at %px, because the metadata is incontiguous\n", size, addr);
>
> Why not use WARN()?

Changed this to panic().

> > +
> > +struct kmsan_context_state *task_kmsan_context_state(void);
>
> s/task_kmsan_context_state/kmsan_task_context_state/ or something like that.

Changed to kmsan_task_context_state.

> > +{
> > + int in_interrupt = this_cpu_read(kmsan_in_interrupt);
> > +
> > + /* Turns out it's possible for in_interrupt to be >0 here. */
>
> Why/how? Expand the comment.

Dropped this function

>
> [...]
>
> > +void kmsan_nmi_enter(void)
> > +{
> > + bool in_nmi = this_cpu_read(kmsan_in_nmi);
> > +
> > + BUG_ON(in_nmi);
> > + BUG_ON(preempt_count() & NMI_MASK);
>
> BUG_ON(in_nmi())?

I've actually dropped context-specific functions, leaving only
kmsan_context_{enter,exit}
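Roughly, the idea is that the separate per-context flags collapse into a
single nesting counter. A toy sketch (hypothetical names, a plain global
instead of a per-CPU variable, not the actual v4 code):

```c
#include <assert.h>
#include <stdbool.h>

/* Single nesting counter replacing kmsan_in_interrupt/kmsan_in_nmi etc.
 * In the real runtime this would be a per-CPU variable. */
static int kmsan_context_depth;

static void kmsan_context_enter(void)
{
	kmsan_context_depth++;
}

static void kmsan_context_exit(void)
{
	kmsan_context_depth--;
}

static bool kmsan_in_context(void)
{
	return kmsan_context_depth > 0;
}
```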

> > +/*
> > + * The functions may call back to instrumented code, which, in turn, may call
> > + * these hooks again. To avoid re-entrancy, we use __GFP_NO_KMSAN_SHADOW.
> > + * Instrumented functions shouldn't be called under
> > + * ENTER_RUNTIME()/LEAVE_RUNTIME(), because this will lead to skipping
> > + * effects of functions like memset() inside instrumented code.
> > + */
>
> Add empty line.

Done

> > + LEAVE_RUNTIME(irq_flags);
> > +}
> > +EXPORT_SYMBOL(kmsan_task_create);
> > +
> > +
>
> Remove empty line.

Done

> > + return;
> > +
> > + ENTER_RUNTIME(irq_flags);
> > + if (flags & __GFP_ZERO) {
>
> No {} needed here.

Done

> > + if (s->ctor)
>
> Why don't we poison if there's a ctor? Some comment is needed.

Done

> > + if (!kmsan_ready || IN_RUNTIME())
> > + return;
> > + ENTER_RUNTIME(irq_flags);
> > + if (flags & __GFP_ZERO) {
>
> No {} needed here.

Done

> > + u8 *vaddr;
> > +
> > + if (!skb || !skb->len)
>
> We either need to check !skb before skb_headlen() or drop the check.

Done

> > + kmsan_internal_check_memory(skb->data, skb_headlen(skb), 0, REASON_ANY);
>
> Use start instead of calling skb_headlen(skb) again. Or just remove
> start and always call skb_headlen(skb).

Done

> > + skb_walk_frags(skb, frag_iter)
> > + kmsan_check_skb(frag_iter);
>
> Hm, won't this recursively walk the same list multiple times?

It should not; see the implementation of skb_dump().
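To spell out why: skb_walk_frags() iterates the frag_list via ->next, and
the fragments themselves hang off ->frag_list only at the head, so the
recursion visits every buffer exactly once. A toy model (hypothetical
"toy_" names, not kernel code):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model of the frag_list walk: fragments are siblings chained via
 * ->next, and only the head skb carries a ->frag_list. Recursing into
 * each fragment therefore visits every buffer exactly once.
 */
struct toy_skb {
	struct toy_skb *next;      /* next sibling in a frag_list */
	struct toy_skb *frag_list; /* this skb's own fragment list */
	int visits;
};

#define toy_walk_frags(skb, iter) \
	for ((iter) = (skb)->frag_list; (iter); (iter) = (iter)->next)

static void toy_check_skb(struct toy_skb *skb)
{
	struct toy_skb *frag_iter;

	skb->visits++; /* stand-in for checking skb->data */
	toy_walk_frags(skb, frag_iter)
		toy_check_skb(frag_iter);
}
```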

> > +}
> > +
> > +extern char _sdata[], _edata[];
>
> #include <asm/sections.h>?

Didn't know that, thanks!

>
> > +
> > +
> > +
>
> Remove excessive whitespace.

Done

> > +
> > + for_each_reserved_mem_region(i, &p_start, &p_end) {
>
> No need for {} here.

Done

>
> > + for_each_online_node(nid)
> > + kmsan_record_future_shadow_range(
> > + NODE_DATA(nid), (char *)NODE_DATA(nid) + nd_size);
>
> Remove tab before (char *)NODE_DATA(nid).

Done

> > + * It's unlikely that the assembly will touch more than 512 bytes.
> > + */
> > + if (size > 512)
>
> Maybe do (WARN_ON(size > 512)) if this is something that we would want
> to detect?

Added a WARN_ONCE to that branch.

> > + /* Ok to skip address check here, we'll do it later. */
> > + shadow_dst = kmsan_get_metadata(dst, n, META_SHADOW);
>
> kmsan_memmove_metadata() performs this check, do we need it here? Same
> goes for other callers of kmsan_memmove/memcpy_metadata().
>

You're right. I've removed the extra checks.

> > + kmsan_internal_memset_shadow(dst, shadow, n, /*checked*/false);
> > + new_origin = 0;
> > + kmsan_internal_set_origin(dst, n, new_origin);
>
> Do we need variables for shadow and new_origin here?

No. They were there in the hope of making __msan_memset() use the shadow
of |c| to initialize |dst|. See https://github.com/google/kmsan/issues/63

> > + if (!kmsan_ready || IN_RUNTIME())
> > + return;
> > +
> > + while (size_copy) {
>
> Why not call kmsan_internal_poison_shadow()/kmsan_internal_memset_shadow()
> here instead of doing this manually?

Done

> > + if (!kmsan_ready || IN_RUNTIME())
> > + return;
> > +
> > + ENTER_RUNTIME(irq_flags);
> > + /* Assuming the shadow exists. */
>
> Why do we assume that shadow exists here, but not in
> __msan_poison_alloca()? Please expand the comment.

We can safely assume that in both cases, and it's a bug if the shadow
doesn't exist. I've removed the misleading comment.

> In some cases the caller of kmsan_print_origin() performs this check
> and prints a differently formatted message (metadata_is_contiguous())
> or no message at all (kmsan_report()). Some unification would be good.
>

Let's just bail out from kmsan_print_origin if the origin is zero.
Only metadata_is_contiguous() may actually pass a zero origin, and the
message there is enough already.
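A minimal sketch of that bail-out (toy model with a counter standing in
for console output, not the v4 code):

```c
#include <assert.h>

typedef unsigned int depot_stack_handle_t;

static int lines_printed; /* toy stand-in for console output */

/*
 * A zero origin can only come from metadata_is_contiguous(), which has
 * already printed its own message, so there is nothing left to decode.
 */
static void kmsan_print_origin(depot_stack_handle_t origin)
{
	if (!origin)
		return;
	lines_printed++; /* the real code prints the origin stack trace */
}
```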

> > + kmsan_pr_err("Local variable description: %s\n", descr);
> > + kmsan_pr_err("Variable was created at:\n");
>
> A shorter way: "Local variable %s created at: ...".

Done

> > + kmsan_pr_err("Uninit was created at:\n");
> > + if (entries)
>
> Should this rather check nr_entries?

SGTM

> > + stack_trace_print(entries, nr_entries, 0);
> > + else
> > + kmsan_pr_err("No stack\n");
>
> KASAN says "(stack is not available)" here. Makes sense to unify with this.

Done

>
> > + break;
> > + }
> > +}
> > +
> > +void kmsan_report(depot_stack_handle_t origin,
> > +   void *address, int size, int off_first, int off_last,
> > +   const void *user_addr, int reason)
>
> It's not really clear what off_first and off_last arguments are, and
> how the range that they describe is different from [address, address +
> size). Some comment would be good.

Added a doc comment to kmsan_report()

> > +
> > + nr_entries = stack_depot_fetch(origin, &entries);
>
> Do we need this here?

No, nor do we need nr_entries and entries.


> > +#define has_origin_page(page) \
> > + (!!((page)->origin))
>
> Something like this would take less space:
>
> #define shadow_page_for(page) ((page)->shadow)
> #define origin_page_for(page) ((page)->origin)
> ...

Done

> > + * Dummy load and store pages to be used when the real metadata is unavailable.
> > + * There are separate pages for loads and stores, so that every load returns a
> > + * zero, and every store doesn't affect other stores.
>
> every store doesn't affect other _reads_?

Ack

> > + BUG_ON(is_origin && !IS_ALIGNED(addr64, ORIGIN_SIZE));
> > + if (kmsan_internal_is_vmalloc_addr(addr)) {
>
> No need for {} here.

Done

> > + * none. The caller must check the return value for being non-NULL if needed.
> > + * The return value of this function should not depend on whether we're in the
> > + * runtime or not.
> > + */
> > +void *kmsan_get_metadata(void *address, size_t size, bool is_origin)
>
> This looks very similar to kmsan_get_shadow_origin_ptr(), would it be
> possible to unify them somehow or to split out common parts into a
> helper function?

I've rewritten kmsan_get_shadow_origin_ptr() to use
kmsan_get_metadata(). It might have become a bit slower (still worth
looking into), but there's less spaghetti code now.
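The shape of the rewrite, as a toy model (simplified types and a fake
"only one tracked address" metadata lookup; none of this is the real
runtime code): kmsan_get_shadow_origin_ptr() becomes a thin wrapper over
kmsan_get_metadata(), falling back to the dummy pages when there is no
metadata.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { META_SHADOW, META_ORIGIN };

static char dummy_load_page[64];  /* always-zero shadow for loads */
static char dummy_store_page[64]; /* scratch shadow for stores */

static char real_shadow[64], real_origin[64];
static char tracked_byte; /* the only address with metadata in this toy */

struct shadow_origin_ptr {
	void *shadow, *origin;
};

/* Toy lookup: only &tracked_byte has metadata. */
static void *kmsan_get_metadata(void *addr, size_t size, int meta)
{
	if (addr != &tracked_byte)
		return NULL;
	return meta == META_ORIGIN ? (void *)real_origin : (void *)real_shadow;
}

static struct shadow_origin_ptr kmsan_get_shadow_origin_ptr(void *addr,
		size_t size, bool store)
{
	struct shadow_origin_ptr ret;

	ret.shadow = kmsan_get_metadata(addr, size, META_SHADOW);
	if (!ret.shadow) {
		/* Loads read zeroes; stores scribble on a scratch page. */
		ret.shadow = store ? dummy_store_page : dummy_load_page;
		ret.origin = ret.shadow;
		return ret;
	}
	ret.origin = kmsan_get_metadata(addr, size, META_ORIGIN);
	return ret;
}
```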


> > + if (kmsan_internal_is_vmalloc_addr(address) ||
> > +     kmsan_internal_is_module_addr(address)) {
>
> No need for {} here.

Done


> > + for (i = 0; i < pages; i++) {
> > + cp = &page[i];
> > + ignore_page(cp);
>
> ignore_page(&page[i])

Done

-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg




