On Sat, 12 Mar 2022 at 21:14, <andrey.konovalov@xxxxxxxxx> wrote:
>
> From: Andrey Konovalov <andreyknvl@xxxxxxxxxx>
>
> Currently, KASAN always uses the normal stack trace collection routines,
> which rely on the unwinder, when saving alloc and free stack traces.
>
> Instead of invoking the unwinder, collect the stack trace by copying
> frames from the Shadow Call Stack whenever it is enabled. This reduces
> boot time by 30% for all KASAN modes when Shadow Call Stack is enabled.

This is impressive.

> To avoid potentially leaking PAC pointer tags, strip them when saving
> the stack trace.
>
> Signed-off-by: Andrey Konovalov <andreyknvl@xxxxxxxxxx>
>
> ---
>
> Things to consider:
>
> We could integrate shadow stack trace collection into kernel/stacktrace.c
> as e.g. stack_trace_save_shadow(). However, using stack_trace_consume_fn
> leads to invoking a callback on each saved from, which is undesirable.
> The plain copy loop is faster.

Why is stack_trace_consume_fn required? It is an internal detail of
arch_stack_walk(), and implementing stack_trace_save_shadow() does not
need it at all.

I think putting stack_trace_save_shadow(), as you have implemented it,
in kernel/stacktrace.c or simply in kernel/scs.c itself would be
appropriate.

> We could add a command line flag to switch between stack trace collection
> modes. I noticed that Shadow Call Stack might be missing certain frames
> in stacks originating from a fault that happens in the middle of a
> function. I am not sure if this case is important to handle though.

I think SCS should just work - and if it doesn't, can we fix it?

It is unclear to me what the deciding factor would be for choosing
between stack trace collection modes, since it is hard to quantify when
and if SCS doesn't work as intended. So I fear it'd just be an option
that's never used, because we don't understand when it is required.

> Looking forward to thoughts and comments.
>
> Thanks!
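To illustrate the point about stack_trace_save_shadow() not needing
stack_trace_consume_fn, here is a rough userspace model of what I have in
mind. It is only a sketch: the function name and the explicit
scs_base/scs_sp parameters are made up for this example (in the kernel
they would come from task_scs() and task_scs_sp() as in the patch), the
store/size/skipnr arguments simply mirror stack_trace_save(), and PAC tag
stripping is left out.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace model of a hypothetical stack_trace_save_shadow().
 * The shadow stack is modelled as a plain array: scs_base points at the
 * oldest entry, scs_sp points one past the most recent return address.
 * No callback is involved; entries are copied directly.
 */
static unsigned int model_stack_trace_save_shadow(const unsigned long *scs_base,
						  const unsigned long *scs_sp,
						  unsigned long *store,
						  unsigned int size,
						  unsigned int skipnr)
{
	const unsigned long *frame = scs_sp;
	unsigned int n = 0;

	assert(scs_base != NULL && scs_sp != NULL && store != NULL);
	assert(scs_base <= scs_sp);

	/* Walk from the most recent return address down to the base. */
	while (frame > scs_base && n < size) {
		frame--;
		if (skipnr > 0) {
			/* Drop the innermost frames the caller asked to skip. */
			skipnr--;
			continue;
		}
		store[n++] = *frame;
	}

	return n;
}
```

Checking the bounds before the store means size == 0 never writes to
store, and the frame pointer is never decremented below scs_base.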
>
> ---
>  mm/kasan/common.c | 36 +++++++++++++++++++++++++++++++++++-
>  1 file changed, 35 insertions(+), 1 deletion(-)
>
> diff --git a/mm/kasan/common.c b/mm/kasan/common.c
> index d9079ec11f31..65a0723370c7 100644
> --- a/mm/kasan/common.c
> +++ b/mm/kasan/common.c
> @@ -9,6 +9,7 @@
>   * Andrey Konovalov <andreyknvl@xxxxxxxxx>
>   */
>
> +#include <linux/bits.h>
>  #include <linux/export.h>
>  #include <linux/init.h>
>  #include <linux/kasan.h>
> @@ -21,6 +22,7 @@
>  #include <linux/printk.h>
>  #include <linux/sched.h>
>  #include <linux/sched/task_stack.h>
> +#include <linux/scs.h>
>  #include <linux/slab.h>
>  #include <linux/stacktrace.h>
>  #include <linux/string.h>
> @@ -30,12 +32,44 @@
>  #include "kasan.h"
>  #include "../slab.h"
>
> +#ifdef CONFIG_SHADOW_CALL_STACK
> +
> +#ifdef CONFIG_ARM64_PTR_AUTH
> +#define PAC_TAG_RESET(x) (x | GENMASK(63, CONFIG_ARM64_VA_BITS))

This should go into arch/arm64/include/asm/kasan.h, and here it should
then just do

#ifndef PAC_TAG_RESET
#define ...

> +#else
> +#define PAC_TAG_RESET(x) (x)
> +#endif

But perhaps there's a better, more generic location for this macro?
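To spell out the #ifndef arrangement, here is a small userspace sketch.
Everything prefixed MODEL_ is invented for the example: MODEL_GENMASK64
stands in for GENMASK(), MODEL_VA_BITS is an assumed value (the real one
comes from CONFIG_ARM64_VA_BITS), and MODEL_NO_ARCH_OVERRIDE only exists
so that both branches can be built from one file. The macro argument is
also parenthesized here, which the quoted version does not do.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel's GENMASK(h, l): bits h..l set, 64-bit wide. */
#define MODEL_GENMASK64(h, l) \
	((~UINT64_C(0) << (l)) & (~UINT64_C(0) >> (63 - (h))))

/* Assumed for illustration only; in the kernel this comes from Kconfig. */
#define MODEL_VA_BITS 48

static_assert(MODEL_VA_BITS > 0 && MODEL_VA_BITS < 64,
	      "VA bits must leave room for tag bits");

/*
 * "Arch header" side: the architecture provides its own definition.
 * Build with -DMODEL_NO_ARCH_OVERRIDE to exercise the generic fallback.
 */
#ifndef MODEL_NO_ARCH_OVERRIDE
#define PAC_TAG_RESET(x) ((uint64_t)(x) | MODEL_GENMASK64(63, MODEL_VA_BITS))
#endif

/* "Generic" side: define the identity fallback only if nothing did so yet. */
#ifndef PAC_TAG_RESET
#define PAC_TAG_RESET(x) (x)
#endif

/* Set all bits above the VA range, as the quoted patch does for kernel addresses. */
static uint64_t strip_tag(uint64_t ret_addr)
{
	return PAC_TAG_RESET(ret_addr);
}
```

With that split, the generic code carries no CONFIG_ARM64_* conditionals
of its own.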
> +static unsigned int save_shadow_stack(unsigned long *entries,
> +				      unsigned int nr_entries)
> +{
> +	unsigned long *scs_sp = task_scs_sp(current);
> +	unsigned long *scs_base = task_scs(current);
> +	unsigned long *frame;
> +	unsigned int i = 0;
> +
> +	for (frame = scs_sp - 1; frame >= scs_base; frame--) {
> +		entries[i++] = PAC_TAG_RESET(*frame);
> +		if (i >= nr_entries)
> +			break;
> +	}
> +
> +	return i;
> +}
> +#else /* CONFIG_SHADOW_CALL_STACK */
> +static inline unsigned int save_shadow_stack(unsigned long *entries,
> +					unsigned int nr_entries) { return 0; }
> +#endif /* CONFIG_SHADOW_CALL_STACK */
> +
>  depot_stack_handle_t kasan_save_stack(gfp_t flags, bool can_alloc)
>  {
>  	unsigned long entries[KASAN_STACK_DEPTH];
>  	unsigned int nr_entries;
>
> -	nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 0);
> +	if (IS_ENABLED(CONFIG_SHADOW_CALL_STACK))
> +		nr_entries = save_shadow_stack(entries, ARRAY_SIZE(entries));
> +	else
> +		nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 0);
>  	return __stack_depot_save(entries, nr_entries, flags, can_alloc);
>  }
>
> --
> 2.25.1
>